
An Improved Transcriber for OwlRepo

Discussion in 'Programming' started by Kirisame, Nov 8, 2024.

    Background

    OwlRepo[1][2] is a popular website created by geospiza for tracking and analyzing Free Market (FM) data.
    Users only need to upload owl screenshots to update the database, which makes it very easy to use.

    However, its transcription of the screenshots is not perfect, so I tried to improve it.

    Problems

    Problem 1: Column Misalignment
    When a store name is very short or simple, the transcriber sometimes shifts some values into the next column, which produces invalid data.
    See xQuin's item in the following example:
    owl1.png
    https://owlrepo.com/listing/72f52fa9-5f7b-47f6-849c-1fb6681c76b1
    owlrepo1.png

    Problem 2: Overlapping Digits
    When an item's price reaches a billion or more, the price becomes so long that its billions digit overlaps with the “bundle”. This makes it difficult for the OCR software to tell the two numbers apart.
    See Reisen's and Luke's items in the following example:
    owl2.png
    https://owlrepo.com/listing/57e12a39-da99-41b1-94be-d0fc8f4b403a
    owlrepo2.png

    Solution
    The solution consists of four steps:
    1. Make a virtual grid to eliminate column misalignment.
    2. Filter pixels by color to extract item names, counts, and overlapping digits.
    3. Fix the shape of overlapping digits. There are only 3 cases in total (1|1, 1|2, 2|1).
    4. Run OCR on each cell to get the results.
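
To make steps 1, 2 and 4 a bit more concrete, here is a minimal sketch in plain Python. It is not the code from the notebook: the function names, the grid boundaries, the text color and the tolerance are all hypothetical, and the OCR engine is passed in as a callable so that no particular library is assumed. Step 3 (repairing overlapping digits) is left out.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # list of pixel rows
Mask = List[List[bool]]        # True where a pixel matched the text color
T = TypeVar("T")


def split_into_cells(image: Image, column_edges: Sequence[int],
                     row_height: int) -> List[List[Image]]:
    """Step 1: cut the image along a fixed grid.

    column_edges are hypothetical x boundaries; in practice they would be
    measured once from the owl window. Because every cell has a fixed
    position, a value cannot slide into a neighboring column.
    """
    rows = []
    for top in range(0, len(image) - row_height + 1, row_height):
        band = image[top:top + row_height]
        rows.append([[line[x0:x1] for line in band]
                     for x0, x1 in zip(column_edges, column_edges[1:])])
    return rows


def color_mask(cell: Image, target: Pixel, tolerance: int = 30) -> Mask:
    """Step 2: keep only pixels whose color is close to the target color."""
    return [[all(abs(c - t) <= tolerance for c, t in zip(px, target))
             for px in line]
            for line in cell]


def transcribe(image: Image, column_edges: Sequence[int], row_height: int,
               text_color: Pixel, ocr: Callable[[Mask], T]) -> List[List[T]]:
    """Step 4: run the supplied OCR callable on each masked cell."""
    return [[ocr(color_mask(cell, text_color)) for cell in row]
            for row in split_into_cells(image, column_edges, row_height)]
```

The `ocr` argument can be any function that turns a boolean mask into text, so the grid and filtering logic can be tested with a dummy function before wiring in a real OCR engine.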
    Demo
    demo.gif
    And here is the transcribed result:
    Code:
    "Maple Shield"
    count: 34
    ['Homofil', '1-6 top', '1', '28,999,999', '1']
    ['FruitBish', '1-13 mid', '1', '99,999,999', '1']
    ['Nixar', '1-2 IntEq,', '1', '500,000,000', '1']
    ['Nixar', '1-2 IntEq,', '1', '415,000,000', '1']
    ['iWaifu', '4-2 Drea.', '1', '42,999,999', '1']
    ['Reisen', 'qwertyuio.', '1', '2,147,483,647', '1']
    ['Luke', '6-7 ##door', '1', '1,300,000,000', '1']
    ['Nixar', '1-2 IntEq,', '1', '200,000,000', '1']
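
If you want to feed rows like these into other code, something along these lines could turn them into typed records. This is only a sketch: I am assuming the first column is the seller, the second is the store name and the fourth is the price, and I leave the other two columns alone. The names `Listing`, `parse_price` and `to_listing` are made up for this example.

```python
from typing import NamedTuple, Sequence

INT32_MAX = 2**31 - 1  # 2,147,483,647, the largest price in the output above


class Listing(NamedTuple):
    seller: str   # assumed: column 1
    store: str    # assumed: column 2
    price: int    # assumed: column 4, with thousands separators removed


def parse_price(text: str) -> int:
    """Convert a string such as '1,300,000,000' to an int."""
    digits = text.replace(",", "")
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a price: {text!r}")
    return int(digits)


def to_listing(row: Sequence[str]) -> Listing:
    """Build a Listing from one 5-column transcribed row."""
    if len(row) != 5:
        raise ValueError(f"expected 5 columns, got {len(row)}")
    return Listing(seller=row[0], store=row[1], price=parse_price(row[3]))
```

A row whose price cell does not parse is a cheap signal that something went wrong during transcription, for example a value that ended up in the wrong column.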
    
    Code
    I've created a Jupyter notebook on Google Colab for you to play with:

    https://colab.research.google.com/drive/1fWxdD3xaZ-o-lR8nkoRWZNjbU312iQoC?usp=sharing

    It still needs to be integrated into a project like OwlRepo before end users can use it.
    Feel free to modify or use this code in your own projects,
    and let me know if you find any ways to improve it.
     
    Last edited: Nov 8, 2024