
OwlRepo - a repository of transcribed owl searches

Discussion in 'Items & Mesos' started by geospiza, Jul 3, 2020.

  1. geospiza (Web Developer, Staff Member)
    Hey all, I'd like to share a site I built for fun and learning. OwlRepo is a site that machine-transcribes owl searches and summarizes them. The best part is that anyone can help out by uploading screenshots from in-game.


    The front page contains a searchable summary of every upload to the site. This is subject to change while I figure out the best way to present information about current prices. I usually set prices in my stores around the p25 value, which is the 25th percentile: the price that is greater than 25% of the listings on the market at that time.
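    To make the p25 idea concrete, here is a small standalone sketch of a nearest-rank percentile over a list of listing prices. It is purely illustrative; the prices and the exact percentile method the site uses are assumptions on my part.

```python
def percentile(prices, p):
    """Return the price greater than roughly p percent of the sorted
    listings, using simple nearest-rank selection."""
    ranked = sorted(prices)
    # nearest-rank index: round(p/100 * n) - 1, clamped to a valid position
    index = max(0, min(len(ranked) - 1, round(p / 100 * len(ranked)) - 1))
    return ranked[index]

# hypothetical owl listing prices for one item
prices = [500_000, 600_000, 650_000, 700_000, 900_000, 1_200_000, 2_000_000, 5_000_000]
print(percentile(prices, 25))  # → 600000
```

    With a list like this, pricing at p25 undercuts most of the market without racing to the very bottom.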

    If you click on an item, it will take you to the history of owl screenshots. You can get a sense of the change in the market by looking at the plot over time.


    If you click on one of the items on this page, you will see the complete contents of the machine transcription. You can copy and paste the results into a spreadsheet for your own purposes. For a quick example of what's currently possible, take a look at these owl searches for all of the cape scrolls for base stats.


    The centerpiece of the site is the ability to upload your own owl searches. I've made uploading as streamlined as possible: no account is necessary, and no more of the screenshot is kept than is needed to transcribe the owl search. Try it out :)

    I use the Owl of Minerva to search for all of these items, and take an in-game screenshot of each page.


    I then use the Upload tab to submit the screenshots from the MapleLegendsHD folder.


    A minute later, the transcribed data is available in the index.


    In case you're interested in some of the details of how this thing was built:

    The heart of this is an OCR engine called tesseract. I put together a small library that extracts the data from the screenshots into a machine-readable format. Each batch of owl searches is put into a JSON file that can be used in other applications. The web service exposes this as part of its API.
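    As a rough idea of what that extraction step involves, here is a hypothetical sketch of turning one OCR'd owl-result line into a structured entry. The line format and the regular expression are my assumptions; the actual text tesseract emits, and the library's real parsing rules, will differ.

```python
import re

# Assumed owl-result line shape: "<store name> <bundle> <price>", e.g.
# "geo's scrolls 1 1,400,000". This is illustrative, not the real format.
LINE = re.compile(r"^(?P<store>.+?)\s+(?P<bundle>\d+)\s+(?P<price>[\d,]+)$")

def parse_entry(line: str) -> dict:
    """Parse a single transcribed listing line into a structured entry."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unparseable owl line: {line!r}")
    return {
        "store_name": match["store"],
        "bundle": int(match["bundle"]),
        "price": int(match["price"].replace(",", "")),
    }

print(parse_entry("geo's scrolls 1 1,400,000"))
```

    In practice OCR output is noisy, so a real parser needs to tolerate misread digits and dropped characters rather than match a clean pattern like this.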


    Since I haven't open-sourced the server code at the time of writing, here's the schema for the response.

      "type": "object",
      "properties": {
        "screenshot": {
          "type": "object",
          "properties": {
            "timestamp": { "type": "string", "format": "date-time" },
            "name": { "type": "string" },
            "sha1": { "type": "string", "format": "uri" }
          "required": ["timestamp", "name", "sha1"]
        "search": {
          "type": "object",
          "properties": {
            "text": { "type": "string" },
            "item": { "type": "string" },
            "results": { "type": "integer" }
          "required": ["text", "item", "results"]
        "paginator": {
          "type": "object",
          "properties": {
            "text": { "type": "string" },
            "current": { "type": "integer" },
            "total": { "type": "integer" }
          "required": ["text", "current", "total"]
        "body": {
          "type": "object",
          "properties": {
            "text": { "type": "string" },
            "entries": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "id": { "type": "string" },
                  "store_name": { "type": "string" },
                  "bundle": { "type": "integer", "minimum": 1, "maximum": 200 },
                  "price": { "type": "integer", "minimum": 1 },
                  "quantity": { "type": "integer", "minimum": 1 }
                "required": ["id", "store_name", "bundle", "price", "quantity"]
          "required": ["body", "entries"]
      "required": ["screenshot", "search", "paginator", "body"]

    When I'm ready to ship changes to the back-end, a new Docker image is built and deployed to various services. The site is hosted on Google Cloud Platform and takes advantage of their free tier.


    It should scale to millions of owl searches a month. Of course, I'm the only user at the moment, with maybe hundreds of pages a day.

    The data gets inserted into a BigQuery table, which can be used to generate tables and charts about everything that is uploaded. Here's the SQL that's used to generate the current chart pictured above. There are more sophisticated queries that can be used to handle bad actors.

      -- convert the nested JSON structure into a row per seller
      WITH flattened AS (
        SELECT
          upload.task_id,
          payload.screenshot.timestamp AS screenshot_timestamp,
          payload.screenshot.sha1 AS screenshot_sha1,
          payload.search.item AS search_item,
          payload.search.results AS search_results,
          REPLACE(REPLACE(payload.paginator.text, "(", ""), ")", "") AS paginator_text,
          -- the per-seller columns (store_name, bundle, price, quantity)
          entries.*
        FROM
          owl.upload_v1 AS upload,
          UNNEST(upload.payload) payload,
          UNNEST(payload.body.entries) entries
          WITH OFFSET AS entry_index ),
      extracted AS (
        SELECT
          DATE(screenshot_timestamp) AS screenshot_date,
          *
        FROM
          flattened )
      -- generate statistics over time/search result window
      SELECT
        DISTINCT * EXCEPT(price),
        COUNT(*) OVER (PARTITION BY task_id, screenshot_date, search_item, search_results) AS search_results_captured,
        PERCENTILE_CONT(price, 0.5) OVER (PARTITION BY task_id, screenshot_date, search_item, search_results) AS median_price
      FROM
        extracted
      ORDER BY
        search_results_captured DESC
    The back-end is otherwise a dumb pipe that just transcribes the owls.

    The site itself handles displaying the machine-read files in a human-friendly way. It's written in Svelte, which has been enjoyable to work with so far. If you're curious, the source can be inspected in the browser development tools. I'm no web developer though, so it can look ugly.

    Old individual listing page screenshot for cape scrolls.

    This dashboard is no longer in service; you did well, Data Studio.

    Updated 2020-12-01
    • Moved out-of-date screenshots into an appendix section
    • Quoted my closing thoughts to keep context for the discussion, though the information is mostly out of date

