
Commit 6f3c73c

Feature: support custom int/float type (#7)
* support custom int/float type
1 parent 3dc1faf commit 6f3c73c


12 files changed

+959
-181
lines changed


README.md

Lines changed: 52 additions & 0 deletions
@@ -26,6 +26,58 @@ assert r == {'country': 'COUNTRY', 'isp': 'ISP'}
2626

2727
## Examples
2828
see [csv_to_mmdb.py](./examples/csv_to_mmdb.py)
29+
30+
31+
## Using the Java Client
32+
33+
### TLDR
34+
35+
When generating an MMDB file for use with the Java client, you must specify the `int_type`:
36+
37+
```python
38+
from mmdb_writer import MMDBWriter
39+
40+
writer = MMDBWriter(int_type='int32')
41+
```
42+
43+
Alternatively, you can specify data types explicitly, as described in the [Type Enforcement](#type-enforcement) section.
44+
45+
### Underlying Principles
46+
47+
When the Java client deserializes a record into a structure, each numeric field takes the Java type that corresponds to
48+
its stored MMDB type. The mapping is as follows:
49+
50+
| mmdb type | java type |
51+
|--------------|------------|
52+
| float (15) | Float |
53+
| double (3) | Double |
54+
| int32 (8) | Integer |
55+
| uint16 (5) | Integer |
56+
| uint32 (6) | Long |
57+
| uint64 (9) | BigInteger |
58+
| uint128 (10) | BigInteger |
59+
60+
By default, the Python writer chooses an MMDB integer type based on the magnitude of each `int`. For instance,
61+
`int(1)` is stored as `uint16`, while `int(2**16+1)` is stored as `uint32`. The same field can therefore map to
62+
`Integer` in one record and `Long` in another, which may cause deserialization failures in Java clients. To avoid this,
63+
set the `int_type` parameter when generating MMDB files so that every integer is stored with one fixed type.
64+
65+
## Type Enforcement
66+
67+
MMDB defines several numeric types: `int32`, `uint16`, `uint32`, `uint64`, and `uint128` for integers,
68+
and `f32` and `f64` for floating-point values. Python has only one integer type and one float type (a 64-bit float, i.e. `f64`).
69+
70+
The `int_type` parameter controls which MMDB integer type the writer uses when generating an MMDB file. The
71+
behavior for each `int_type` setting is:
72+
73+
| int_type | Behavior |
74+
|----------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
75+
| auto (default) | Automatically selects the MMDB numeric type based on the value size. <br/>Rules: <br/>`int32` for value < 0 <br/>`uint16` for 0 <= value < 2^16<br/>`uint32` for 2^16 <= value < 2^32<br/>`uint64` for 2^32 <= value < 2^64<br/> `uint128` for value >= 2^64. |
76+
| i32 | Stores all integer types as `int32`. |
77+
| u16 | Stores all integer types as `uint16`. |
78+
| u32 | Stores all integer types as `uint32`. |
79+
| u64 | Stores all integer types as `uint64`. |
80+
| u128 | Stores all integer types as `uint128`. |
2981

3082

3183
## Reference:
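
The `auto` rules and the Java mapping added to the README in this hunk can be sketched in a few lines. This is an illustration only, not the library's implementation: the helper names `pick_mmdb_int_type` and `java_types_for` are hypothetical, and the thresholds and Java types are copied from the two README tables.

```python
# Hypothetical helper mirroring the "auto" rules in the README table.
def pick_mmdb_int_type(value: int) -> str:
    if value < 0:
        return "int32"
    if value < 2**16:
        return "uint16"
    if value < 2**32:
        return "uint32"
    if value < 2**64:
        return "uint64"
    return "uint128"


# MMDB integer type -> Java type, as listed in the README conversion table.
JAVA_TYPE = {
    "int32": "Integer",
    "uint16": "Integer",
    "uint32": "Long",
    "uint64": "BigInteger",
    "uint128": "BigInteger",
}


def java_types_for(values):
    """Return the set of Java types a single field would need for these values."""
    return {JAVA_TYPE[pick_mmdb_int_type(v)] for v in values}


# One field holding both a small and a large value needs two Java types under "auto".
mixed = java_types_for([1, 2**16 + 1])
```

Under these assumptions, a field that holds both `1` and `2**16 + 1` would need both `Integer` and `Long` on the Java side, which is the mismatch that a fixed `int_type` avoids.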

examples/csv_to_mmdb.py

Lines changed: 13 additions & 8 deletions
@@ -8,25 +8,30 @@
88

99

1010
def main():
11-
    writer = MMDBWriter(4, 'Test.GeoIP', languages=['EN'], description="Test IP library")
11+
    writer = MMDBWriter(
12+
        4, "Test.GeoIP", languages=["EN"], description="Test IP library"
13+
    )
1214
    data = defaultdict(list)
1315

1416
    # merge cidr
15-
    with open('fake_ip_info.csv', 'r') as f:
17+
    with open("fake_ip_info.csv", "r") as f:
1618
        reader = csv.DictReader(f)
1719
        for line in reader:
18-
            data[(line['country'], line['isp'])].append(IPNetwork(f'{line["ip"]}/{line["prefixlen"]}'))
20+
            data[(line["country"], line["isp"])].append(
21+
                IPNetwork(f'{line["ip"]}/{line["prefixlen"]}')
22+
            )
1923
    for index, cidrs in data.items():
20-
        writer.insert_network(IPSet(cidrs), {'country': index[0], 'isp': index[1]})
21-
    writer.to_db_file('fake_ip_library.mmdb')
24+
        writer.insert_network(IPSet(cidrs), {"country": index[0], "isp": index[1]})
25+
    writer.to_db_file("fake_ip_library.mmdb")
2226

2327

2428
def test_read():
2529
    import maxminddb
26-
    m = maxminddb.open_database('fake_ip_library.mmdb')
27-
    r = m.get('3.1.1.1')
30+
31+
    m = maxminddb.open_database("fake_ip_library.mmdb")
32+
    r = m.get("3.1.1.1")
2833
    print(r)
2934

3035

31-
if __name__ == '__main__':
36+
if __name__ == "__main__":
3237
    main()
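
The `# merge cidr` step in this example groups networks by `(country, isp)` and relies on netaddr's `IPSet` to merge adjacent ranges before insertion. A minimal sketch of the same grouping idea follows, using the standard library's `ipaddress.collapse_addresses` in place of `IPSet`. The sample rows and the helper name `merge_by_key` are hypothetical; only the column names come from the example script.

```python
import ipaddress
from collections import defaultdict

# Hypothetical sample rows with the same columns the example script reads.
rows = [
    {"ip": "10.0.0.0", "prefixlen": "25", "country": "COUNTRY", "isp": "ISP"},
    {"ip": "10.0.0.128", "prefixlen": "25", "country": "COUNTRY", "isp": "ISP"},
    {"ip": "192.0.2.0", "prefixlen": "24", "country": "OTHER", "isp": "ISP"},
]


def merge_by_key(rows):
    """Group networks by (country, isp), then collapse adjacent networks."""
    grouped = defaultdict(list)
    for row in rows:
        net = ipaddress.ip_network(f'{row["ip"]}/{row["prefixlen"]}')
        grouped[(row["country"], row["isp"])].append(net)
    # collapse_addresses merges adjacent or overlapping networks into supernets.
    return {
        key: list(ipaddress.collapse_addresses(nets))
        for key, nets in grouped.items()
    }


merged = merge_by_key(rows)
```

Here the two adjacent `/25` networks collapse into a single `/24`, so each distinct record would be inserted once per merged range rather than once per CSV row.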
