@bertsky bertsky commented May 9, 2020

This is a major rework of Ocropy's rule-based segmentation.

It greatly improves the situation with some long-standing problems, among them…

  • DPI relativity (again)
  • robust h/v-line separator detection (again)
  • text/image segmentation (sort-of)
  • conflation of very close text lines (vertically due to ascenders/descenders, horizontally due to noise)
  • reading order
  • performance (OpenCV and PIL instead of SciPy)

…but also offers solutions to uncharted terrain (for ocropus/ocrolib, that is)…

  • page segmentation into regions (via new variant of recursive X-Y cut)
  • table recognition (but not detection!)
  • incremental annotation (ignoring but sorting existing regions, deleting existing text regions from reading order)

The OCR-D processor most affected by this is ocrd-cis-ocropy-segment (now with a usable level-of-operation=page and a new level-of-operation=table), and to a lesser extent ocrd-cis-ocropy-resegment.

For details, see the changelog of the individual commits.

Here is the segment processor's docstring:

Segment pages into regions+lines, tables into cells+lines, or regions into lines.

Open and deserialise PAGE input files and their respective images,
then iterate over the element hierarchy down to the requested level.

Depending on level-of-operation, consider existing segments:

  • if overwrite_separators=True on page level, then
    delete any SeparatorRegions,
  • if overwrite_regions=True on page level, then
    delete any top-level TextRegions (along with ReadingOrder),
  • if overwrite_regions=True on table level, then
    delete any TextRegions in TableRegions (along with their OrderGroup),
  • if overwrite_lines=True on region level, then
    delete any TextLines in TextRegions.
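
The overwrite rules above can be summarised as a small lookup. This is only an illustrative sketch: the function name `segments_to_delete` and the returned strings are hypothetical and not the processor's actual code; only the parameter names and levels are taken from the docstring.

```python
def segments_to_delete(level, overwrite_separators=False,
                       overwrite_regions=False, overwrite_lines=False):
    """Return the kinds of existing segments that would be deleted.

    Hypothetical helper mirroring the docstring's rules; `level` stands
    for the level-of-operation parameter.
    """
    targets = []
    if level == 'page':
        if overwrite_separators:
            targets.append('SeparatorRegion')
        if overwrite_regions:
            # top-level text regions go away together with the reading order
            targets += ['TextRegion', 'ReadingOrder']
    elif level == 'table':
        if overwrite_regions:
            # cells go away together with the table's order group
            targets += ['TextRegion', 'OrderGroup']
    elif level == 'region':
        if overwrite_lines:
            targets.append('TextLine')
    else:
        raise ValueError('unknown level-of-operation: %r' % level)
    return targets
```

Note that each overwrite flag only takes effect on its own level, so for example `overwrite_regions=True` deletes nothing when operating on region level.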

Next, get each element image according to the layout annotation (from
the alternative image of the page/region, or by cropping via coordinates
into the higher-level image) in binarized form, and represent it as an array
with non-text regions and (remaining) text neighbours suppressed.

Then compute a text line segmentation for that array (as a label mask).
When level-of-operation is page or table, this also entails
detecting

  • up to maximages large foreground images,
  • up to maxseps foreground h/v-line separators and
  • up to maxcolseps background column separators

before text line segmentation itself, as well as aggregating text lines
to text regions afterwards.
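
One ingredient of h-line detection named in the commit notes is morphological opening along the line direction. The following pure-Python sketch shows only that single step (the actual implementation combines several open/close operations with binary reconstruction and works on arrays). The function name `open_horizontal` and the toy values for `scale` and the page are assumptions for illustration; `hlminwidth` is the parameter described in the commit notes as the minimum h-line length in multiples of scale.

```python
def open_horizontal(binary, min_length):
    """Binary opening with a 1 x min_length horizontal line element.

    Keeps exactly those foreground pixels (1) that lie in a horizontal
    run of at least min_length pixels; all others become background (0).
    """
    result = [[0] * len(row) for row in binary]
    for y, row in enumerate(binary):
        x = 0
        while x < len(row):
            if not row[x]:
                x += 1
                continue
            start = x
            while x < len(row) and row[x]:
                x += 1
            if x - start >= min_length:
                for i in range(start, x):
                    result[y][i] = 1
    return result


scale = 2        # assumed estimate of the text line scale in pixels
hlminwidth = 3   # minimum h-line length in multiples of scale
page = [
    [1, 1, 0, 1, 0, 0, 1, 1],  # short runs: glyph fragments
    [1, 1, 1, 1, 1, 1, 1, 0],  # one long run: a horizontal rule
    [0, 1, 0, 0, 1, 1, 0, 0],
]
hlines = open_horizontal(page, hlminwidth * scale)
```

Only the second row survives here, because it is the only run of at least 6 pixels. Expressing the threshold in multiples of the estimated scale is what keeps the criterion independent of resolution.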

Text regions are detected via a hybrid variant of the recursive X-Y cut algorithm
(RXYC): RXYC partitions the binarized image in top-down manner by detecting
horizontal or vertical gaps. This implementation uses the bottom-up text line
segmentation to guide the search, and also uses both pre-existing and newly
detected separators to alternatively partition the respective boxes into
non-rectangular parts.
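
For orientation, here is a minimal sketch of the plain, textbook recursive X-Y cut on a binary image given as nested lists. It is not the hybrid variant described above: it neither consults line labels nor separators, and it simply cuts at the widest inner gap. All names (`find_gaps`, `xy_cut`, `min_gap`) are assumptions made for this illustration.

```python
def find_gaps(profile, min_gap):
    """Return (start, end) runs of zeros of length >= min_gap.

    Runs touching either margin are ignored, since they do not
    separate anything.
    """
    gaps, start = [], None
    for i, value in enumerate(profile):
        if value == 0:
            if start is None:
                start = i
        else:
            if start is not None and start > 0 and i - start >= min_gap:
                gaps.append((start, i))
            start = None
    return gaps


def xy_cut(img, box=None, min_gap=2):
    """Recursively split box (y0, y1, x0, x1) at its widest gap.

    Returns the tight bounding boxes of the leaves, in cut order.
    """
    if box is None:
        box = (0, len(img), 0, len(img[0]))
    y0, y1, x0, x1 = box
    rows = [sum(img[y][x0:x1]) for y in range(y0, y1)]
    cols = [sum(img[y][x] for y in range(y0, y1)) for x in range(x0, x1)]
    if not any(rows):
        return []  # no foreground at all
    candidates = [(e - s, 'y', s, e) for s, e in find_gaps(rows, min_gap)]
    candidates += [(e - s, 'x', s, e) for s, e in find_gaps(cols, min_gap)]
    if not candidates:
        ys = [i for i, v in enumerate(rows) if v]
        xs = [i for i, v in enumerate(cols) if v]
        return [(y0 + ys[0], y0 + ys[-1] + 1, x0 + xs[0], x0 + xs[-1] + 1)]
    _, axis, s, e = max(candidates)
    if axis == 'y':
        parts = [(y0, y0 + s, x0, x1), (y0 + e, y1, x0, x1)]
    else:
        parts = [(y0, y1, x0, x0 + s), (y0, y1, x0 + e, x1)]
    return xy_cut(img, parts[0], min_gap) + xy_cut(img, parts[1], min_gap)


img = [
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1],
]
regions = xy_cut(img)
# → [(0, 2, 0, 3), (4, 6, 0, 3), (0, 6, 5, 8)]
```

The toy page is first split at the vertical gap between the columns, then the left column is split at its horizontal gap, which yields the two left blocks before the right column.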

During line segmentation, suppress the foreground of all previously annotated
regions (of any kind) and lines, except if just removed due to overwrite.
During region aggregation however, combine the existing separators with the
new-found separators to guide the column search.
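
Suppressing existing segments during line segmentation amounts to clearing their pixels in the binarized array. A minimal sketch, assuming axis-aligned boxes instead of the polygons a PAGE annotation would actually provide; the name `suppress_regions` and the box format are hypothetical.

```python
def suppress_regions(binary, boxes):
    """Return a copy of binary with all pixels inside boxes set to 0.

    boxes are (top, bottom, left, right) with exclusive bottom/right;
    parts of a box outside the image are ignored.
    """
    result = [row[:] for row in binary]
    for top, bottom, left, right in boxes:
        for y in range(max(top, 0), min(bottom, len(result))):
            for x in range(max(left, 0), min(right, len(result[y]))):
                result[y][x] = 0
    return result


binary = [[1, 1, 1, 1] for _ in range(3)]
existing = [(0, 2, 1, 3)]  # one previously annotated region
masked = suppress_regions(binary, existing)
```

The input array is left untouched, so the unsuppressed foreground remains available for later steps such as region aggregation.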

All detected segments (both text line and text region) are sorted according
to their reading order (assuming a top-to-bottom, left-to-right ordering).
When level-of-operation is page, prefer vertical (column-first)
succession of regions. When it is table, prefer horizontal (row-first)
succession of cells.
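
A rough sketch of such an ordering on bounding boxes follows. It uses the pairwise rule given in the commit notes for morph.reading_order (left-to-right when two boxes overlap vertically but not horizontally, otherwise top-down by vertical centre). The `Box` type and function names are assumptions for illustration, and a pairwise rule like this is not guaranteed to be transitive on arbitrary layouts.

```python
from collections import namedtuple
from functools import cmp_to_key

Box = namedtuple('Box', 'name top bottom left right')


def compare(a, b):
    """Negative if a should be read before b, positive if after."""
    y_overlap = a.top < b.bottom and b.top < a.bottom
    x_overlap = a.left < b.right and b.left < a.right
    if y_overlap and not x_overlap:
        # side by side: left one comes first
        return -1 if a.left < b.left else 1
    # otherwise: compare vertical centres, top-down
    ca = (a.top + a.bottom) / 2
    cb = (b.top + b.bottom) / 2
    return (ca > cb) - (ca < cb)


def reading_order(boxes, reverse=False):
    """Sort boxes top-down and left-right (or the reverse)."""
    return sorted(boxes, key=cmp_to_key(compare), reverse=reverse)


boxes = [
    Box('C', 20, 30, 0, 30),   # full-width block at the bottom
    Box('B', 0, 10, 20, 30),   # top right
    Box('A', 0, 10, 0, 10),    # top left
]
names = [box.name for box in reading_order(boxes)]
# → ['A', 'B', 'C']
```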

Then for each resulting segment label, convert its background mask into
polygon outlines by finding the outer contours consistent with the element's
polygon outline. Annotate the result by adding it as a new TextLine/TextRegion:

  • If level-of-operation is region, then append the new lines to the
    parent region.
  • If it is table, then append the new lines to their respective regions,
    and append the new regions to the parent table.
    (Also, create an OrderedGroup for it as the parent's RegionRef.)
  • If it is page, then append the new lines to their respective regions,
    and append the new regions to the page.
    (Also, create an OrderedGroup for it in the ReadingOrder.)

Produce a new output file by serialising the resulting hierarchy.

(Example images following shortly.)

Robert Sachunsky added 26 commits May 9, 2020 22:00
(instead of doing ad-hoc binarization, which is either redundant
 and thus wastes time, or may be a suboptimal workflow choice)
(by using Shapely instead of CV2 to simplify polygons)
If level-of-operation=region, then recurse into
text regions within table regions, finding text
lines.
If a table does not have any text regions yet,
then add one pseudo-block to it which covers
the whole table (and recurse into that, but in
fullpage mode to also detect h/v-lines and white
space columns).
- morph.all_neighbors:
  - fix return value (keep pairs)
  - add kwarg bg to skip
  - use true shift instead of roll,
    and fill with bg
  - add kwarg dist for arbitrary distance
- psegutils.compute_boxmap:
  break loop earlier (faster)
- make fg h/v-line detection more robust to
  non-contiguous or slightly skewed/bent shapes,
  as well as overlapping/touching glyphs:
  - compute_separators_morph: better algorithm
    based on a sequence of vertical/horizontal
    open and close operations, combined with
    binary reconstruction/seedfill
  - remove_hlines: deprecate
  - compute_hlines: new implementation analogous
    to compute_separators_morph
  - return slightly dilated (enlarged) masks
    to facilitate annotating polygon contours
    (besides immediately suppressing foreground)
- improve bg column separator detection:
  - vertically dilate gradient edges before
    (not after) combining with thresholded whitespace
  - run after (not before) removing v-lines
- call scale estimation with zoom parameter
  (making this central measure itself become
   DPI-relative)
- revise earlier doubling of estimated scale:
  now only necessary for compute_line_seeds
  (but not for separator and gradmap estimation);
- compute_line_seeds:
  - use vscale=2
  - respect colseps early on
- hmerge_line_seeds: more robust
  based on morphology and consistency of center point
  (y center inside other's bbox, but
   x center not inside other's bbox),
  not only global overlap counts
- rename compute_line_labels back to compute_segmentation
- also improve spreading line seeds to background:
  - watch fg components: split seed conflicts,
    but keep others on their majority side
  - simplify (much faster)
- lines2regions: new implementation:
  instead of bottom-up bbox matching/merging rules,
  combine adjacent lines by morphologically closing
  while splitting at fg h/v-line and bg colseps

- improve documentation
- use type-check decorators (as in ocrolib proper)
- uncomment all DSAVE statements,
  but disable function via decorator
  (as single place to re-enable all
   file-based or interactive plots)
- introduce/pass new ocrd-tool parameters
  - hlminwidth (minimum length of h-lines,
    in multiples of scale)
  - csminheight (minimum length of v-lines,
    in multiples of scale)
- after page segmentation, also add
  detected text lines below text regions
- do not give up when a contour
  retains too small a share of the background
  (only foreground is relevant for threshold)
- common.compute_segmentation:
  - read binary image instead of grayscale-normalized
  - pass in external separator mask to be combined
    with detected separators
  - move sanity checks to common.check_*
- segment:
  - aggregate all previously existing regions,
    including text regions/lines (except when
    removing them anyway via `overwrite`),
    and suppress them in the foreground
    while doing line segmentation; moreover,
    pass any separators among them extra
    (to guide region segmentation)
  - when `level-of-operation=page`, after
    suppressing existing tables, iterate
    through them and if they do not contain
    text regions (cells) yet, segment them
    into regions and lines likewise
- README: update
- common.compute_segmentation with fullpage
  (for segment on page level): prevent sepmask
  (from h/v-lines and colseps) from being filled
  when spreading line seeds into background by
  provisionally attaching a label for it and
  spreading it against the other line labels
- common.compute_line_seeds: skip aggregating
  height statistics for warnings of large lines
  (too slow)
- when adding text regions to pages or tables,
  also add to (recursive) ReadingOrder in the
  order of region labels;
- when indexed, start from existing elements;
- when adding cells to a table, convert
  RegionRef(Indexed) to an equally located
  OrderedGroup(Indexed)
- instead of silently segmenting existing tables
  when running on page level, now simply ignore tables
  (like any other non-text regions), but offer a
  dedicated table level, which ignores text regions
  (like any other non-table regions), and only
  segments tables without existing cells
- add overwrite_separators=True on page level
  (for finding separators via ocrolib instead
   of ignoring existing separator regions)
- uniform_filter based dilate/erode/open/close:
  replace exact zero with approx infinitesimal
  to avoid artifacts from rounding
- all: correct pixel origin depending on even/odd
  filter size to avoid asymmetric results as best
  as possible (even kernel sizes of course will
  still cause asymmetry, but to a lesser extent)
- morph.reading_order:
  instead of providing only plain y.start top-down
  ordering, add this combined strategy doing both
  - y.center top-down but also
  - y.overlaps x.non-overlaps left-right
  and also offer the reverse (bottom-up, right-left)
- sl: add missing functions:
  - compose: slice the slice!
  - xcenter_in / ycenter_in
  - top / bottom / left / right
- compute_images:
  new function to detect and suppress large foreground
  objects early on that are not h/v-lines
  (and _not_ search for h/v-lines or assign text lines
   within them);
  this could be true graphics/photos/figures, but also
  drop-capitals
- compute_hlines / compute_separators_morph:
  reconstruct after opening by length not by keeping
  any overlapping component, but only up to a certain
  distance (to avoid overlapping glyphs but still get
  most of the line's parts);
  ignore parts that already belong to image components
  identified by compute_images (so they don't compete
  in n-best race)
- compute_colseps_conv:
  - don't blur away small/protruding glyphs below fg/bg threshold
  - don't use cleaned but raw boxmap (again) to avoid
    marking small fg as bg
- compute_segmentation:
  filter out line labels that only have noise fg (i.e.
  components that have been filtered as too small/large)
- ensure odd kernel sizes everywhere
- DSAVE (visualization for debugging):
  use uniformly bright and maximally differentiating
  colormap, and set off 0 (background) to black;
  allow passing a second array with foreground as white
- lines2regions: instead of bbox consistency heuristics,
  implement a hybrid recursive X-Y cut segmentation, which
  not only considers horizontal/vertical gaps in foreground
  (discounting noise pixels), but also avoids splitting
  line labels, and uses the detected/passed-in separators to
  alternatively cut at non-rectangular partitions instead of
  horizontal/vertical gaps
- sort all line labels, gap-based slices and separator-based
  partitions via proper (top-down left-to-right, or reversed)
  reading order
- add an ImageRegion for each image found by compute_segmentation
- follow-up on 4bb1ddb (incremental annotation):
  during line segmentation, merely suppress neighbouring/other
  existing segments, but during region segmentation, pass
  separators as sepmask but other regions as pseudo-line labels
  to be identified within reading order;
  afterwards, re-identify them (to avoid adding new elements,
  but still reference them accordingly in the reading order group)
- follow-up on 7748ca4 (add reading order):
  - reference each region in the ReadingOrder, increasing
    @index when in an OrderedGroup(Indexed) (on Page
    or in TableRegion)
  - for TableRegions, replace existing RegionRef(Indexed)
    by an equally indexed OrderedGroup(Indexed) to hold
    all the cell regions
  - do the right thing in ReadingOrder even when
    `overwrite_regions=True`
- for all pre-existing and new-found separators and images,
  create a derived image where they are suppressed (white)
- write that image (with `clipped` in @comments) to the
  second output file group (or `OCR-D-IMG-CLIP`)

bertsky commented May 9, 2020

Also fixes #41 (and a still unreported regression from 48a89e9), both in recognize.


lgtm-com bot commented May 9, 2020

This pull request introduces 10 alerts and fixes 2 when merging f242984 into 48a89e9 - view on LGTM.com

new alerts:

  • 5 for Unused local variable
  • 2 for Unused import
  • 1 for 'import *' may pollute namespace
  • 1 for Variable defined multiple times
  • 1 for Nested loops with same variable

fixed alerts:

  • 2 for Except block handles 'BaseException'


bertsky commented May 9, 2020

Example A

  1. original (provided by @wrznr) filemax00005
  2. binarization with ocrd-olena-binarize (sauvola-ms-split / k=0.1)
    5SAUVOLA-HEAVY_0001-BIN_sauvola-ms-split
  3. large component, non-separator ("image") detection
    tmpa4_iazdoimages6_dilated
  4. h-line separator detection
    tmpy15870auhlines6_v-dilated
  5. v-line separator detection
    tmpi9ld6wtrcolseps6_h-dilated
  6. column separator detection, background threshold
    tmpolj9j_cacolwsseps1_thresh
  7. column separator detection, horizontal gradient map
    tmp0nk650m9colwsseps2_grad-raw
  8. column separator detection, combined (final colwsseps)
    tmp4ohh_v81colseps
  9. all separators/images combined
    tmpwpb_8fn4sepmask
  10. textline detection, vertical gradient map
    tmp1vyuljeegradmap
  11. textline detection, bottom/top marks and line seed
    tmpkx_eacbglineseeds+bmarked+tmarked
  12. textline detection, filtered line labels
    tmpq81sec5mlineseeds_filtered
  13. textline detection, spreading from fg into bg, but also against separators
    tmptyhfmo8alineseeds_spread
  14. textline detection, final line labels
    tmpm7rmbd4wllabels
  15. recursive X-Y cut, full-size box with x/y profiles at the margins
    tmp0po8pvs8recursive_x_y_cut
  16. recursive X-Y cut, full-size box with all gap candidates (un/prominent, dis/allowed)
    tmpnjvlfzzfrecursive_x_y_cut_gaps_h
  17. recursive X-Y cut, first vertical slice with all gaps
    tmpy4qaiioqrecursive_x_y_cut_gaps_h
  18. recursive X-Y cut, second vertical slice with all gaps
    tmpvy0ut9gnrecursive_x_y_cut_gaps_h
  19. recursive X-Y cut, second vertical slice, second horizontal slice
    tmp40m8w7f8recursive_x_y_cut
  20. recursive X-Y cut, second vertical slice, second horizontal slice
    tmp_5uktw6qrecursive_x_y_cut
  21. recursive X-Y cut, second vertical slice, second horizontal slice
    tmpyh94vy6wrecursive_x_y_cut_gaps_h
  22. recursive X-Y cut, second vertical slice, second horizontal slice
    tmphiqfr0vorecursive_x_y_cut_gaps_h
  23. recursive X-Y cut, second vertical slice, second horizontal slice
    tmp34ii5q3krecursive_x_y_cut_gaps_h
  24. recursive X-Y cut, second vertical slice, second horizontal slice (no more cuts)
    tmpgvxbbw28recursive_x_y_cut
  25. recursive X-Y cut, third vertical slice with all gaps
    tmp2nua2qi7recursive_x_y_cut_gaps_h
  26. recursive X-Y cut, third vertical slice, first vertical slice with all gaps
    tmpy66u2ua5recursive_x_y_cut_gaps_h
  27. recursive X-Y cut, third vertical slice, first vertical slice with 3 vertical partitions
    tmpdkvjmbrzrecursive_x_y_cut_partitions
  28. recursive X-Y cut, third vertical slice, first vertical slice, middle partition with all gaps
    tmp13nlaa5urecursive_x_y_cut_gaps_v
  29. recursive X-Y cut, third vertical slice, first vertical slice, middle partition, second vertical slice with all gaps
    tmpeeqz9m3zrecursive_x_y_cut_gaps_h
  30. recursive X-Y cut, third vertical slice, first vertical slice, middle partition, second vertical slice, second vertical slice with vertical gaps
    tmpnnd6sy5qrecursive_x_y_cut_gaps_v
    note: we don't choose the most prominent gaps here, because they would produce partitions that sum up to less total height than the partitions created by the less prominent blue gap (we value height more than width because we are segmenting a page and not a table)
  31. recursive X-Y cut, third vertical slice, first vertical slice, middle partition, second vertical slice, second vertical slice, first vertical slice with all gaps
    tmp0leqgzzlrecursive_x_y_cut_gaps_h
  32. recursive X-Y cut, third vertical slice, first vertical slice, middle partition, second vertical slice, second vertical slice, first vertical slice with 3 vertical partitions
    tmp7pe5owrbrecursive_x_y_cut_partitions
  33. recursive X-Y cut, third vertical slice, first vertical slice, middle partition, second vertical slice, second vertical slice, first vertical slice, left partition with all gaps
    tmp0sg_n01erecursive_x_y_cut_gaps_v
  34. recursive X-Y cut, third vertical slice, first vertical slice, middle partition, second vertical slice, second vertical slice, first vertical slice, left partition, all 5 slices
    tmpq_irfv4hrecursive_x_y_cut
    tmpnjcy3zihrecursive_x_y_cut
    tmpp7oeyuvnrecursive_x_y_cut
    tmpfq298d8zrecursive_x_y_cut
    tmpafeaxgf4recursive_x_y_cut
    tmp1ctnqc_8recursive_x_y_cut
  35. recursive X-Y cut, third vertical slice, first vertical slice, middle partition, second vertical slice, second vertical slice, second vertical slice with all gaps
    tmpxjfkmlqfrecursive_x_y_cut_gaps_h
  36. recursive X-Y cut, third vertical slice, first vertical slice, middle partition, second vertical slice, second vertical slice, second vertical slice, third horizontal slice with all gaps
    tmp2gi17q6orecursive_x_y_cut_gaps_h
  37. recursive X-Y cut, final regions with contours around lines
    tmphahhcmlzrlabels_closed
  38. final result in PageViewer with reading order and quite a few pseudo-regions due to binarization noise
    5OCROREGIONSXYMASK-again4_0001


bertsky commented May 9, 2020

Example B

  1. original (provided by @jbarth-ubhd, downsampled from 1200 to 300 DPI for Github)
    02_-arndt1710-_000_096
  2. large component, non-separator ("image") detection
    tmpuq09ekgximages6_dilated
  3. h-line separator detection
    tmp7l3i3cikhlines6_v-dilated
  4. v-line separator detection
    tmp8dubj8pkcolseps6_h-dilated
  5. column separator detection, combined (final colwsseps)
    tmpxxdkbk6kcolseps
  6. textline detection, filtered line labels
    tmp8dtaxn_5lineseeds_filtered
  7. textline detection, final line labels
    tmpr_f3xjoullabels
  8. recursive X-Y cut, full-size box with x/y profiles at the margins
    tmppf9i4zqyrecursive_x_y_cut
  9. recursive X-Y cut, full-size box with 2 vertical partitions
    tmp2_17lrfsrecursive_x_y_cut_partitions
  10. recursive X-Y cut, left partition with all gaps
    tmpvg9q6s96recursive_x_y_cut_gaps_v
  11. recursive X-Y cut, left partition, first vertical slice (no more cuts possible)
    tmpr7qucclmrecursive_x_y_cut
  12. recursive X-Y cut, left partition, second vertical slice (no more cuts possible)
    tmpf1767sjvrecursive_x_y_cut
  13. recursive X-Y cut, right partition with all gaps
    tmpeevzc_4brecursive_x_y_cut_gaps_v
  14. recursive X-Y cut, final regions with contours around lines
    tmps160sd1urlabels_closed
  15. final result in PageViewer with reading order and benign under-segmentation
    TEMP4_0001


bertsky commented May 9, 2020

Example C

  1. original (artificial, digital-born)
    Gutachten2-2 0
  2. Tesseract segmentation in PageViewer (malign under-segmentation in paragraphs and cells, and textlines crossing separators, table detection with sub-optimal boundaries)
    OCR-D-SEG-LINE-tesseract-CLIP_Gutachten2-2
    note: We'll throw away separators (overwrite_separators) and text regions (overwrite_regions), keeping only table regions, because we have no table detection of our own yet!
  3. h-line separator detection (cutting off overlapping glyph)
    tmphpp3xzodhlines6_v-dilated
  4. v-line separator detection (neglecting segments which are only 1 line in height)
    tmptt5ycg5kcolseps6_h-dilated
  5. column separator detection, combined (final colwsseps)
    tmpsq13xvs7colwsseps3_seps
    Note: we have suppressed the table regions that we want to keep during page segmentation (just not for h/v-line detection)
  6. all separators combined (including ignored regions)
    tmpsuc_pyt3sepmask
  7. textline detection, filtered line labels
    tmpkrx6uj7ylineseeds_filtered
  8. textline detection, spreading from fg into bg, but also against separators
    tmpcfc_y_1mlineseeds_spread
  9. textline detection, final line labels
    tmphtdep9tdllabels
  10. recursive X-Y cut, full-size box with x/y profiles at the margins
    tmp9n0_yad_recursive_x_y_cut
  11. recursive X-Y cut, full-size box with 5 vertical partitions
    tmpvl_yr8fhrecursive_x_y_cut_partitions
    Note: the upper table's separators bleed into the overall background, so they don't yield separate partitions at this iteration; the lower 2 tables are isolated enough to create true partitions, but since they also contain large existing table regions, a number of would-be partitions get fused together
  12. recursive X-Y cut, upper/largest partition with all gaps
    tmp0sly729zrecursive_x_y_cut_gaps_v
  13. recursive X-Y cut, upper/largest partition, first vertical slice with x/y profiles at the margins
    tmpds45sd62recursive_x_y_cut
    Note: we are not allowed to cut through the existing table region segment now, but the heading is too close for a cut between text and table
  14. recursive X-Y cut, upper/largest partition, first vertical slice with 2 partitions
    tmp44jyhvfjrecursive_x_y_cut_partitions
    Note: within this slice, we at least get to partition the heading against the table (but they would be split afterwards even if they had been kept in one region label)
  15. recursive X-Y cut, upper/largest partition, second vertical slice with all gaps
    tmp0ll8v3x6recursive_x_y_cut_gaps_v
  16. recursive X-Y cut, upper/largest partition, third vertical slice, second vertical slice, first horizontal slice with all gaps
    tmpzrjz9cozrecursive_x_y_cut_gaps_v
  17. recursive X-Y cut, upper/largest partition, third vertical slice, second vertical slice, second horizontal slice with all gaps
    tmpj06hpihvrecursive_x_y_cut_gaps_h
  18. recursive X-Y cut, third partition with x/y profiles at the margins
    tmp5vd2rwbmrecursive_x_y_cut_masked
    Note: again, we are not allowed to cut within the existing table region, and the adjacent text region is too close for a cut
  19. recursive X-Y cut, final regions with line contours
    tmpf2lab0j7rlabels
  20. recursive X-Y cut, final regions with contours around lines
    tmpviyamrmarlabels_closed
  21. page segmentation result in PageViewer with reading order including tables (but still without recursive structure)
    TEMP4_Gutachten2-2
    Note: now we can enter level-of-operation=table for the 3 table instances
  22. all separators combined (fg lines existing from page segmentation and bg colseps detected here)
    tmpvhe6t6zqsepmask
  23. textline detection, filtered line labels
    tmpsokljrzslineseeds_filtered
  24. textline detection, spreading from fg into bg, but also against separators
    tmp4dg_era9lineseeds_spread
  25. recursive X-Y cut, full-size box with x/y profiles at the margins
    tmp4v3bc7sprecursive_x_y_cut
  26. recursive X-Y cut, full-size box with 8 partitions
    tmpd9oj1wvtrecursive_x_y_cut_partitions
  27. recursive X-Y cut, final regions with contours around lines
    tmpjwp3mvrirlabels_closed
  28. final result in PageViewer with reading order and recursive table structure
    TEMP5_Gutachten2-2


bertsky commented May 9, 2020

A word about performance: resegment and region-level segment are much faster than before, but page/table-level segment is slower than before, because recursive X-Y cut (despite not using backtracking) takes its toll. However, a 300 DPI page should still not take more than 30s. (Images with higher pixel density do not get downsampled yet, so runtime will probably increase quadratically with DPI. There is a lot of headroom for further optimizations, e.g. not repeating component analysis unnecessarily every other line.)
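
To make the quadratic remark concrete: the number of pixels grows with the square of the resolution, so a back-of-the-envelope estimate looks as follows. This is pure arithmetic under the stated assumption, not a measurement; the 30 s reference is merely the upper bound quoted above, and `estimated_runtime` is a hypothetical helper.

```python
def estimated_runtime(dpi, ref_dpi=300, ref_seconds=30.0):
    """Scale a reference runtime by the pixel-count ratio (dpi / ref_dpi) ** 2.

    Assumes runtime is proportional to the number of pixels, which is
    only a guess for the real processor.
    """
    return ref_seconds * (dpi / ref_dpi) ** 2


estimates = {dpi: estimated_runtime(dpi) for dpi in (300, 600, 1200)}
# → {300: 30.0, 600: 120.0, 1200: 480.0}
```

Under that assumption, a 1200 DPI scan (like the original of Example B before downsampling) would cost about 16 times as much as its 300 DPI version.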

I should also mention that there are quite a few parameters to control page segmentation. I have no idea how general my defaults are, though. If you get strange results, look at the number and length of lines to be detected, the number of images to be detected, and especially gap_width and gap_height. (Plus, I recommend Sylwester & Seth 1996: A Trainable, Single-Pass Algorithm for Column Segmentation for ideas on how to optimise these from GT data.)

Or activate the DSAVE function to visualise intermediate results as shown above (either interactively via pyplot.show() or as files via pyplot.imsave()):

@disabled()
def DSAVE(title,array, interactive=False):
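
How such a switch can work is sketched below. This is a guess at the pattern, not the code from the repository: the `enabled` keyword, the wrapper logic and the stand-in function bodies (which merely record their calls instead of plotting) are all assumptions for illustration.

```python
import functools


def disabled(enabled=False):
    """Hypothetical decorator factory: make the function a no-op unless enabled."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if enabled:
                return func(*args, **kwargs)
            return None  # silently skip the call
        return wrapper
    return decorator


saved = []  # stands in for files written or windows shown


@disabled()
def DSAVE(title, array, interactive=False):
    saved.append((title, interactive))


@disabled(enabled=True)
def DSAVE_ACTIVE(title, array, interactive=False):
    saved.append((title, interactive))


DSAVE("lineseeds", [[0, 1]])       # skipped
DSAVE_ACTIVE("sepmask", [[1, 0]])  # recorded
```

The point of the design is that all debugging calls can stay in the code, and a single decorator argument decides whether they do anything.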


lgtm-com bot commented May 10, 2020

This pull request introduces 6 alerts and fixes 3 when merging 907c00f into 48a89e9 - view on LGTM.com

new alerts:

  • 3 for Unused local variable
  • 1 for 'import *' may pollute namespace
  • 1 for Variable defined multiple times
  • 1 for Nested loops with same variable

fixed alerts:

  • 2 for Except block handles 'BaseException'
  • 1 for 'import *' may pollute namespace

@bertsky bertsky force-pushed the segment-table-lines branch from 907c00f to 32786a6 Compare May 10, 2020 19:02

lgtm-com bot commented May 10, 2020

This pull request introduces 1 alert and fixes 3 when merging 32786a6 into 48a89e9 - view on LGTM.com

new alerts:

  • 1 for 'import *' may pollute namespace

fixed alerts:

  • 2 for Except block handles 'BaseException'
  • 1 for 'import *' may pollute namespace


lgtm-com bot commented May 10, 2020

This pull request introduces 1 alert and fixes 3 when merging b505d65 into 48a89e9 - view on LGTM.com

new alerts:

  • 1 for 'import *' may pollute namespace

fixed alerts:

  • 2 for Except block handles 'BaseException'
  • 1 for 'import *' may pollute namespace

@bertsky bertsky requested a review from finkf May 11, 2020 10:00
@bertsky bertsky linked an issue May 11, 2020 that may be closed by this pull request

bertsky commented May 11, 2020

Thanks @finkf for the invitation!

Cannot invite non-collaborators for a review directly, but if @kba or @wrznr would care to give it a try, that would be awesome.


@finkf finkf left a comment


I am OK with it.

@finkf finkf merged commit fe129fe into cisocrgroup:dev May 11, 2020
@bertsky bertsky mentioned this pull request Jan 12, 2021
4 tasks
Development

Successfully merging this pull request may close these issues.

ocrd-cis-ocropy-recognize: 'ascii' codec can't decode byte 0xa9

2 participants