Commit bc86e7a

Improve documentation and clippy.
- Implemented FromStr properly for Manifest to allow for the use of parse()
1 parent 22df8a2 commit bc86e7a

6 files changed: +78 -25 lines changed

README.md

Lines changed: 19 additions & 0 deletions
@@ -52,3 +52,22 @@ async fn main() {
     }
 }
 ```
+
+## A word about server backends
+
+There are three valid options for backends for a given server. These are:
+
+- `CVMFS`: This backend requires `cvmfs/info/v1/repositories.json` to be present on the server. The scrape fails if it is missing.
+- `S3`: Does not even attempt to fetch `cvmfs/info/v1/repositories.json`. Note that if any server has S3 as a backend, a list of repositories *must* be passed to the scraper, as there is no other way to determine the list of repositories for S3 servers. Due to the async scraping of all servers, there is currently no support for falling back on repositories detected from other server types (including possibly the Stratum0).
+- `AutoDetect`: Attempts to fetch `cvmfs/info/v1/repositories.json` but does not fail if it is missing. If the scraper fails to fetch the file, the backend is assumed to be S3. If the list of repositories is empty, the scraper returns an empty list. If your S3 server has no repositories, setting the backend to AutoDetect allows the scraper to continue without failing.
+
+For populated servers, the field `backend_detected` is set to the detected backend, which for explicit S3 or CVMFS servers is the same as the requested type.
+
+## What repositories are scraped?
+
+- For servers that are set to or detected as CVMFS, the scraper scrapes the union of the detected repositories and the repositories explicitly stated in the configuration.
+- For servers that are set to or detected as S3, only the explicitly stated repositories are scraped (and the scraper fails if the server type is explicitly set to S3 and no repositories are passed).
+
+## License
+
+Licensed under the MIT license. See the LICENSE file for details.
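
The backend rules added to the README above can be read as a small decision table. The following is a minimal sketch of that table, not the crate's implementation: the `Backend` enum and the `plan_scrape` function are hypothetical names invented for illustration, and the `detected` argument stands in for the outcome of fetching `cvmfs/info/v1/repositories.json` (`Some` if the file was fetched, `None` if it was missing).

```rust
use std::collections::BTreeSet;

// Hypothetical stand-in for the crate's backend type; names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Backend {
    Cvmfs,
    S3,
    AutoDetect,
}

// Union of detected and explicitly configured repositories, deduplicated and sorted.
fn union(found: &[&str], explicit: &[&str]) -> Vec<String> {
    found
        .iter()
        .chain(explicit.iter())
        .map(|name| name.to_string())
        .collect::<BTreeSet<String>>()
        .into_iter()
        .collect()
}

// Returns the detected backend and the repositories to scrape, or an error
// in the two cases the README describes as failures.
fn plan_scrape(
    requested: Backend,
    detected: Option<&[&str]>,
    explicit: &[&str],
) -> Result<(Backend, Vec<String>), String> {
    match (requested, detected) {
        // CVMFS: repositories.json is required; scrape detected plus explicit.
        (Backend::Cvmfs, Some(found)) => Ok((Backend::Cvmfs, union(found, explicit))),
        (Backend::Cvmfs, None) => Err("repositories.json is missing".to_string()),
        // Explicit S3: nothing is fetched, so an explicit list is mandatory.
        (Backend::S3, _) if explicit.is_empty() => {
            Err("S3 backend requires an explicit repository list".to_string())
        }
        (Backend::S3, _) => Ok((Backend::S3, union(&[], explicit))),
        // AutoDetect: a successful fetch means CVMFS, a failed one means S3.
        (Backend::AutoDetect, Some(found)) => Ok((Backend::Cvmfs, union(found, explicit))),
        (Backend::AutoDetect, None) => Ok((Backend::S3, union(&[], explicit))),
    }
}

fn main() {
    let detected = ["a.example.org"];
    let explicit = ["b.example.org"];
    println!("{:?}", plan_scrape(Backend::Cvmfs, Some(&detected[..]), &explicit));
    println!("{:?}", plan_scrape(Backend::AutoDetect, None, &[]));
    println!("{:?}", plan_scrape(Backend::S3, None, &[]));
}
```

Note the asymmetry the README calls out: an `AutoDetect` server that turns out to be S3 with no explicit repositories yields an empty list, while an explicitly `S3` server with no repositories is an error.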

src/lib.rs

Lines changed: 2 additions & 2 deletions
@@ -73,8 +73,8 @@ pub mod errors;

 pub use errors::{CVMFSScraperError, HostnameError, ManifestError, ScrapeError};
 pub use models::{
-    FailedServer, Hostname, PopulatedRepositoryOrReplica, PopulatedServer, ScrapedServer, Server,
-    ServerBackendType, ServerType,
+    FailedServer, Hostname, Manifest, MaybeRfc2822DateTime, PopulatedRepositoryOrReplica,
+    PopulatedServer, ScrapedServer, Server, ServerBackendType, ServerType,
 };

 use futures::future::join_all;

src/models/cvmfs_published.rs

Lines changed: 28 additions & 18 deletions
@@ -8,22 +8,28 @@ use crate::utilities::{parse_boolean_field, parse_hex_field, parse_number_field}
 /// The manifest of a repository or replica.
 ///
 /// The fields are:
-/// - C: Cryptographic hash of the repository’s current root catalog
-/// - B: Size of the root file catalog in bytes
-/// - A: true if the catalog should be fetched under its alternative name
-/// - R: MD5 hash of the repository’s current root path (usually always d41d8cd98f00b204e9800998ecf8427e)
-/// - X: Cryptographic hash of the signing certificate
-/// - G: true if the repository is garbage-collectable
-/// - H: Cryptographic hash of the repository’s named tag history database
-/// - T: Unix timestamp of this particular revision
-/// - D: Time To Live (TTL) of the root catalog
-/// - S: Revision number of this published revision
-/// - N: The full name of the manifested repository
-/// - M: Cryptographic hash of the repository JSON metadata
-/// - Y: Cryptographic hash of the reflog checksum
-/// - L: currently unused (reserved for micro catalogs)
-/// signature - In order to provide authoritative information about a repository publisher, the repository manifest is signed by an X.509 certificate together with its private key.
-
+/// - c: Cryptographic hash of the repository’s current root catalog
+/// - b: Size of the root file catalog in bytes
+/// - a: true if the catalog should be fetched under its alternative name
+/// - r: MD5 hash of the repository’s current root path (usually always d41d8cd98f00b204e9800998ecf8427e)
+/// - x: Cryptographic hash of the signing certificate
+/// - g: true if the repository is garbage-collectable
+/// - h: Cryptographic hash of the repository’s named tag history database
+/// - t: Unix timestamp of this particular revision
+/// - d: Time To Live (TTL) of the root catalog
+/// - s: Revision number of this published revision
+/// - n: The full name of the manifested repository
+/// - m: Cryptographic hash of the repository JSON metadata
+/// - y: Cryptographic hash of the reflog checksum
+/// - l: currently unused (reserved for micro catalogs)
+/// - signature: In order to provide authoritative information about a repository publisher, the
+///   repository manifest is signed by an X.509 certificate together with its private key.
+///   This field is not validated by this library.
+///
+/// Note that the field names are lowercase, but the field names in the manifest itself are uppercase.
+///
+/// See https://cvmfs.readthedocs.io/en/stable/cpt-details.html#repository-manifest-cvmfspublished for
+/// more information.
 #[derive(Deserialize, Serialize, Clone, PartialEq)]
 pub struct Manifest {
     pub c: HexString,
@@ -68,8 +74,10 @@ impl std::fmt::Debug for Manifest {
     }
 }

-impl Manifest {
-    pub fn from_str(content: &str) -> Result<Self, ManifestError> {
+impl std::str::FromStr for Manifest {
+    type Err = ManifestError;
+
+    fn from_str(content: &str) -> Result<Self, Self::Err> {
         let mut data: HashMap<char, String> = HashMap::new();
         let mut signature: String = String::new();
         let mut is_signature = false;
@@ -111,7 +119,9 @@ impl Manifest {

         Ok(manifest)
     }
+}

+impl Manifest {
     pub fn display(&self) {
         println!(" Manifest for repository: {}", self.n);
         println!(" Root catalog hash: {}", self.c);
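
Moving `from_str` from an inherent method into a `std::str::FromStr` implementation is what makes `str::parse()` available for the type. The sketch below illustrates the pattern with a heavily reduced, hypothetical `MiniManifest`; it is not the crate's parser. It assumes a simplified manifest layout in which each line starts with a one-letter key, a line containing only `--` starts the signature section, and the signature is plain text. `MiniManifest` and `MiniManifestError` are invented for this example.

```rust
use std::collections::HashMap;
use std::str::FromStr;

// Hypothetical, reduced stand-in for `Manifest`: only two fields plus the signature.
#[derive(Debug, PartialEq)]
struct MiniManifest {
    n: String, // repository name, from the `N` line
    s: u64,    // revision number, from the `S` line
    signature: String,
}

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum MiniManifestError {
    MissingField(char),
    InvalidNumber(char),
}

impl FromStr for MiniManifest {
    type Err = MiniManifestError;

    fn from_str(content: &str) -> Result<Self, Self::Err> {
        let mut data: HashMap<char, String> = HashMap::new();
        let mut signature = String::new();
        let mut is_signature = false;

        for line in content.lines() {
            if is_signature {
                // Everything after the separator is kept verbatim.
                signature.push_str(line);
                signature.push('\n');
            } else if line == "--" {
                is_signature = true;
            } else if let Some(key) = line.chars().next() {
                // Keys are uppercase in the file and lowercase in the struct.
                let value = line[key.len_utf8()..].to_string();
                data.insert(key.to_ascii_lowercase(), value);
            }
        }

        let n = data
            .remove(&'n')
            .ok_or(MiniManifestError::MissingField('n'))?;
        let s = data
            .remove(&'s')
            .ok_or(MiniManifestError::MissingField('s'))?
            .parse::<u64>()
            .map_err(|_| MiniManifestError::InvalidNumber('s'))?;

        Ok(MiniManifest { n, s, signature })
    }
}

fn main() {
    let text = "Nexample.repo.org\nS42\n--\nsignature-text\n";
    // `parse()` resolves to the `FromStr` impl through the annotated target type.
    let manifest: MiniManifest = text.parse().expect("valid manifest");
    println!("{} at revision {}", manifest.n, manifest.s);
}
```

Because the target type of `parse()` is inferred, a function that already returns `Result<Manifest, _>` can end with a bare `.parse()` call on the fetched text.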

src/models/generic.rs

Lines changed: 10 additions & 0 deletions
@@ -106,6 +106,16 @@ impl<'de> Deserialize<'de> for HexString {
     }
 }

+/// A wrapped optional string that may be an RFC 2822 date-time.
+///
+/// Because the date-time fields written into the CVMFS JSON files are
+/// produced with the `date` command, they may be localized to the system
+/// that generated them. This means that the date-time fields may not be
+/// reliably parsable.
+///
+/// To offer both the option of a time-parsed field and the raw string, we store
+/// the string itself and provide a method (`try_into_datetime`) to attempt to
+/// parse the string into a `DateTime<Utc>`.
 #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, Default)]
 pub struct MaybeRfc2822DateTime(pub Option<String>);

src/models/mod.rs

Lines changed: 2 additions & 1 deletion
@@ -5,7 +5,8 @@ mod meta_json;
 mod repositories_json;
 mod servers;

-pub use generic::{HexString, Hostname};
+pub use cvmfs_published::Manifest;
+pub use generic::{HexString, Hostname, MaybeRfc2822DateTime};
 pub use servers::{
     FailedServer, PopulatedRepositoryOrReplica, PopulatedServer, ScrapedServer, Server,
     ServerBackendType, ServerType,

src/models/servers.rs

Lines changed: 17 additions & 4 deletions
@@ -548,10 +548,7 @@ impl RepositoryOrReplica {
             self.server.hostname.0, self.name
         );
         let response = client.get(url).send().await?;
-        let content = response.error_for_status()?.text().await?;
-        let content = content.as_str();
-        // println!("{}", content);
-        Manifest::from_str(content)
+        response.error_for_status()?.text().await?.parse()
     }

     async fn fetch_repository_status_json(
@@ -569,6 +566,22 @@ impl RepositoryOrReplica {
     }
 }

+/// A populated repository or replica object.
+///
+/// This object represents a CVMFS repository or replica that has been scraped for information about
+/// the repository. For fetching the revision of the repository, one can use the `revision` method
+/// as a shortcut to get the revision from the manifest.
+///
+/// Fields:
+///
+/// - name: The name of the repository
+/// - manifest: The manifest of the repository
+/// - last_snapshot: The last time a snapshot was taken (optional)
+/// - last_gc: The last time garbage collection was run (optional)
+///
+/// The MaybeRfc2822DateTime type is used to represent a date and time that may or may not be present,
+/// and may or may not be in the RFC 2822 format. See the documentation for the MaybeRfc2822DateTime
+/// type for more information.
 #[derive(Debug, Serialize, Clone, PartialEq)]
 pub struct PopulatedRepositoryOrReplica {
     pub name: String,
