Commit 3059db6

Fix/14 caching issues using dictionary (#15)
fix: caching point issue (#14)
1 parent 6d83669 commit 3059db6

2 files changed

+44
-33
lines changed

README.md

Lines changed: 34 additions & 17 deletions
@@ -10,14 +10,16 @@
 
 # MaxMind-DB-Writer-python
 
-Make `mmdb` format ip library file which can be read by [`maxmind` official language reader](https://dev.maxmind.com/geoip/geoip2/downloadable/)
+Make `mmdb` format ip library file which can be read by [
+`maxmind` official language reader](https://dev.maxmind.com/geoip/geoip2/downloadable/)
 
-~~[The official perl writer](https://github.com/maxmind/MaxMind-DB-Writer-perl) was written in perl,
-which was difficult to customize.
+~~[The official perl writer](https://github.com/maxmind/MaxMind-DB-Writer-perl) was written in perl,
+which was difficult to customize.
 So I implemented the `MaxmindDB format` ip library in python language.~~
 
-MaxMind has now released an official Go version of the MMDB writer.
-If you prefer using Go, you can check out the official Go implementation [mmdbwriter](https://github.com/maxmind/mmdbwriter).
+MaxMind has now released an official Go version of the MMDB writer.
+If you prefer using Go, you can check out the official Go
+implementation [mmdbwriter](https://github.com/maxmind/mmdbwriter).
 This project still provides a Python alternative for those who need it.
 
 ## Install
@@ -27,30 +29,35 @@ pip install -U mmdb_writer
 ```
 
 ## Usage
+
 ```python
 from netaddr import IPSet
 
 from mmdb_writer import MMDBWriter
+
 writer = MMDBWriter()
 
 writer.insert_network(IPSet(['1.1.0.0/24', '1.1.1.0/24']), {'country': 'COUNTRY', 'isp': 'ISP'})
 writer.to_db_file('test.mmdb')
 
 import maxminddb
+
 m = maxminddb.open_database('test.mmdb')
 r = m.get('1.1.1.1')
 assert r == {'country': 'COUNTRY', 'isp': 'ISP'}
 ```
 
 ## Examples
+
 see [csv_to_mmdb.py](./examples/csv_to_mmdb.py)
 Here is a professional and clear translation of the README.md section from Chinese into English:
 
 ## Using the Java Client
 
-### TLDR
+If you are using the Java client, you need to be careful to set the `int_type` parameter so that Java correctly
+recognizes the integer type in the MMDB file.
 
-When generating an MMDB file for use with the Java client, you must specify the `int_type`:
+Example:
 
 ```python
 from mmdb_writer import MMDBWriter
@@ -65,15 +72,15 @@ Alternatively, you can explicitly specify data types using the [Type Enforcement
 In Java, when deserializing to a structure, the numeric types will use the original MMDB numeric types. The specific
 conversion relationships are as follows:
 
-| mmdb type | java type |
-|--------------|------------|
-| float (15) | Float |
-| double (3) | Double |
-| int32 (8) | Integer |
-| uint16 (5) | Integer |
-| uint32 (6) | Long |
-| uint64 (9) | BigInteger |
-| uint128 (10) | BigInteger |
+| mmdb type | java type |
+|-----------|------------|
+| float | Float |
+| double | Double |
+| int32 | Integer |
+| uint16 | Integer |
+| uint32 | Long |
+| uint64 | BigInteger |
+| uint128 | BigInteger |
 
 When using the Python writer to generate an MMDB file, by default, it converts integers to the corresponding MMDB type
 based on the size of the `int`. For instance, `int(1)` would convert to `uint16`, and `int(2**16+1)` would convert
@@ -97,7 +104,17 @@ MMDB file. The behaviors for different `int_type` settings are:
 | u64 | Stores all integer types as `uint64`. |
 | u128 | Stores all integer types as `uint128`. |
 
+If you want to use different int types for different scenarios, you can use type wrapping:
+
+```python
+from mmdb_writer import MMDBWriter, MmdbI32, MmdbF32
+
+writer = MMDBWriter()
+# the value of field "i32" will be stored as int32 type
+writer.insert_network(IPSet(["1.0.0.0/24"]), {"i32": MmdbI32(128), "f32": MmdbF32(1.22)})
+```
+
+## Reference:
 
-## Reference:
 - [MaxmindDB format](http://maxmind.github.io/MaxMind-DB/)
 - [geoip-mmdb](https://github.com/i-rinat/geoip-mmdb)

mmdb_writer.py

Lines changed: 10 additions & 16 deletions
@@ -378,11 +378,12 @@ def encode_meta(self, meta):
             res += self.encode(v, meta_type.get(k))
         return res
 
-    def encode(self, value, type_id=None):
+    def encode(self, value, type_id=None, return_offset=False):
         if self.cache:
             cache_key = self._freeze(value)
             try:
-                return self.data_cache[cache_key]
+                offset = self.data_cache[cache_key]
+                return offset if return_offset else self._encode_pointer(offset)
             except KeyError:
                 pass

@@ -399,18 +400,11 @@ def encode(self, value, type_id=None):
         res = encoder(value)
 
         if self.cache:
-            # add to cache
-            if type_id == 1:
-                self.data_list.append(res)
-                self.data_pointer += len(res)
-                return res
-            else:
-                self.data_list.append(res)
-                pointer_position = self.data_pointer
-                self.data_pointer += len(res)
-                pointer = self.encode(pointer_position, 1)
-                self.data_cache[cache_key] = pointer
-                return pointer
+            self.data_list.append(res)
+            offset = self.data_pointer
+            self.data_pointer += len(res)
+            self.data_cache[cache_key] = offset
+            return offset if return_offset else self._encode_pointer(offset)
         return res
 
 
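The hunk above changes what the cache stores: `data_cache` now maps a frozen value to its data-section offset, and the pointer bytes are produced on demand. The following is a minimal, self-contained sketch of that pattern, not the project's code. `TinyEncoder` and its UTF-8 stand-in payload are hypothetical, and `encode_pointer` is written from the pointer layout in the MaxMind DB format specification rather than taken from the `_encode_pointer` helper, whose body this diff does not show.

```python
def encode_pointer(offset: int) -> bytes:
    """Encode a data-section offset as an MMDB pointer (control byte 001SSVVV)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if offset < 2048:  # size 0: 11-bit value
        return bytes([0x20 | (offset >> 8), offset & 0xFF])
    if offset < 526336:  # size 1: 19-bit value, biased by 2048
        value = offset - 2048
        return bytes([0x28 | (value >> 16)]) + (value & 0xFFFF).to_bytes(2, "big")
    if offset < 134744064:  # size 2: 27-bit value, biased by 526336
        value = offset - 526336
        return bytes([0x30 | (value >> 24)]) + (value & 0xFFFFFF).to_bytes(3, "big")
    return bytes([0x38]) + offset.to_bytes(4, "big")  # size 3: plain 32-bit value


class TinyEncoder:
    """Hypothetical stand-in for the writer's encoder; handles strings only."""

    def __init__(self):
        self.data_list = []      # encoded payloads, in data-section order
        self.data_pointer = 0    # offset at which the next payload will start
        self.data_cache = {}     # value -> offset of its first occurrence

    def encode(self, value: str, return_offset: bool = False):
        if value in self.data_cache:
            offset = self.data_cache[value]
        else:
            payload = value.encode("utf-8")  # stand-in for real MMDB encoding
            self.data_list.append(payload)
            offset = self.data_pointer
            self.data_pointer += len(payload)
            self.data_cache[value] = offset
        # Callers choose between the raw offset and pointer bytes.
        return offset if return_offset else encode_pointer(offset)


enc = TinyEncoder()
first = enc.encode("abc", return_offset=True)
second = enc.encode("de", return_offset=True)
again = enc.encode("abc", return_offset=True)  # cache hit, nothing appended
```

Because the cache holds the offset rather than pre-encoded bytes, a cache hit and a cache miss give the caller the same kind of result, and repeated values are written to the data section only once.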

@@ -484,8 +478,8 @@ def _enumerate_nodes(self, node):
         elif type(node) is SearchTreeLeaf:
             node_id = id(node)
             if node_id not in self._leaf_offset:
-                res = self.encoder.encode(node.value)
-                self._leaf_offset[node_id] = self._data_pointer - len(res)
+                offset = self.encoder.encode(node.value, return_offset=True)
+                self._leaf_offset[node_id] = offset + 16
         else:  # == None
             return
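The `+ 16` in the new line matches the 16-byte separator that the MaxMind DB format places between the search tree and the data section. As a rough illustration of that arithmetic, the sketch below follows the specification's formula `data_section_offset = (record_value - node_count) - 16`. Both function names are hypothetical, and the diff does not show where the writer adds the node count, so that step is an assumption here.

```python
DATA_SECTION_SEPARATOR_SIZE = 16  # zero bytes between search tree and data section


def leaf_record_value(node_count: int, data_offset: int) -> int:
    """Record value a search-tree leaf stores for a data-section offset.

    Assumption: the node count is added on top of `offset + 16`, as the
    MaxMind DB specification requires for data pointers in the tree.
    """
    return node_count + DATA_SECTION_SEPARATOR_SIZE + data_offset


def data_offset_from_record(node_count: int, record_value: int) -> int:
    """Inverse mapping, as a reader would apply it."""
    if record_value <= node_count:
        raise ValueError("record value does not point into the data section")
    return record_value - node_count - DATA_SECTION_SEPARATOR_SIZE


record = leaf_record_value(node_count=1000, data_offset=42)
recovered = data_offset_from_record(node_count=1000, record_value=record)
```

Computing the leaf value from the offset returned by `encode` avoids depending on the length of whatever bytes `encode` happened to return.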
