Exploring player growth via rankings page

Discussion in 'General Discussion' started by geospiza, May 8, 2021.

  1. geospiza
    Offline

    geospiza Web Developer Staff Member Web Developer

    212
    449
    215
    Apr 16, 2020
    12:15 AM
    geospiza
    Dark Knight
    146
    Funk
    I've had an idea for a project brewing in my head for a while. I thought I'd go the route of writing it out before diving into it, and have a little development journal/log to go along with it. I'm also fishing for feedback and ideas from other curious folk here.

    A few months back I made this data visualization, which summarizes how level 200 characters leveled using the publicly available data on the website. I'd like to expand on this. I'm curious about the history of the population, and the trajectory of the average player on the server from different eras. Here are the types of questions that I'd like to answer with the data that I plan on scraping from the website (mostly a brain dump):
    • How often does the ranking change at certain positions over the course of several months?
    • What was the approximate size of the server over its history?
    • Given the history of a player, what is the estimated time until their next level?
    • Are players who started in 2020 more or less likely to stick around than players who started in 2019?
    • In what ways did the 2020 summer balance patch affect the buccaneer/paladin populations?
    • Has the most popular class changed over time?
    • What is the effect of events on the frequency of levels?
    • What is the distribution of player levels today? Last year? Two years ago?
    • What is the cumulative experience gained by everyone, ever?
    As a disclaimer, since I am a web developer -- I do not have direct access to player data. I plan on building something with publicly available data. The constraints make it more interesting anyway, IMO.

    With 192k players in the ranking at the time of writing, it wouldn't be polite to scrape this continuously. I plan on a budget of 2000 scrapes per week, or a ~1% sample of the data. This should take less than 10 minutes at no more than 10 requests a second.
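As a quick sanity check on that budget, here is a minimal sketch. The figures come from the post; the variable names are mine, and I'm assuming one scrape covers one character, which is what the ~1% figure implies.

```python
# Figures from the post; the variable names are illustrative.
total_players = 192_000   # characters in the ranking at the time of writing
weekly_budget = 2_000     # planned scrapes per week
max_rate = 10             # upper bound on requests per second

sample_fraction = weekly_budget / total_players
min_duration_s = weekly_budget / max_rate

print(f"sample fraction: {sample_fraction:.2%}")
print(f"shortest possible run: {min_duration_s / 60:.1f} minutes")
```

At the full 10 requests a second the run takes a little over three minutes, so scraping well below that rate still fits inside the 10-minute estimate.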

    There are many ways to sample data, but this is what I have planned:
    • Stratified sampling by class and level. Balanced data makes it easier to compare sub-populations. For each class, determine the prior distribution of levels. Bucket the distribution in a way where each bucket has the same number of pages from the rankings page, and scrape positions and levels from the page representing those buckets.
    • Fixed characters in the ranking page. Take an initial sample of characters, and track their positions over time. How much do the rankings move?
    Once I get this data (with timestamps), I'll dump it in a bucket as a sqlite/csv file and make it available for analysis. I'll probably make a website too once I figure out what it should look like.

    Hopefully this will be interesting!
     
  2. OP
    geospiza
    I ended up writing new code to scrape the rankings page, which can be found in this notebook: https://github.com/geospiza-fortis/...notebooks/2021-05-08-level-distribution.ipynb

    The session didn't go exactly to plan. I scraped about 45 pages of data to get a sense of how levels relate to rank: 20 of them chosen as linearly spaced pages, 25 of them spaced geometrically.
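For anyone wondering what "linearly spaced" and "geometrically spaced" pages look like, here is a rough sketch using only the standard library. The helper names are hypothetical, and the 40,000-page total is the figure quoted later in this thread; this is not the code from the notebook.

```python
def linear_pages(n_pages, k):
    """Up to k page numbers evenly spaced over 1..n_pages (k >= 2)."""
    step = (n_pages - 1) / (k - 1)
    return sorted({round(1 + i * step) for i in range(k)})


def geometric_pages(n_pages, k):
    """Up to k page numbers spaced by a constant ratio over 1..n_pages (k >= 2).

    Early pages can round to the same number, so duplicates are dropped.
    """
    ratio = n_pages ** (1 / (k - 1))
    return sorted({min(n_pages, round(ratio ** i)) for i in range(k)})


lin = linear_pages(40_000, 20)
geo = geometric_pages(40_000, 25)
pages = sorted(set(lin) | set(geo))
print(len(pages), pages[:10])
```

The linear pages cover the long tail of low-level characters evenly, while the geometric pages cluster near page 1, where the high-level characters live.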

    upload_2021-5-8_17-18-31.png

    As expected, the majority of accounts are lower level. Eyeballing the chart, maybe a quarter of characters have reached 2nd job and an eighth have reached 3rd job. This looks roughly exponential/logarithmic, so I fit the data to an exponential curve using scipy.optimize.curve_fit.

    upload_2021-5-8_17-43-13.png

    Not sure how to determine if this actually fits well. I read up on the chi-square goodness of fit test, but I'm not sure if it's suitable here. In any case, this doesn't work well because the tail extends out into infinity (it guesses that there are 3.6k characters at level 200). I reversed the x-axis too, but I couldn't figure out how to fix the x-intercept. My stats knowledge is pretty shallow, so this is where I nope out.
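One simple way to get a feel for fit quality is to look at R² (the share of variance the curve explains). The sketch below is a swapped-in technique, not the scipy.optimize.curve_fit call from the notebook: it fits y = a * exp(b * x) with ordinary least squares on (x, ln y) using only the standard library. The data points are synthetic and made up for illustration, and the function names are mine.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) via a least-squares line through (x, ln y)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b


def r_squared(xs, ys, a, b):
    """Share of the variance in ys explained by the fitted curve."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic points (NOT scraped data): rank in thousands vs. level.
ranks = [0, 20, 40, 80, 120, 160]
levels = [200 * math.exp(-0.02 * r) for r in ranks]

a, b = fit_exponential(ranks, levels)
r2 = r_squared(ranks, levels, a, b)
print(a, b, r2)
```

A high R² only says the curve tracks the points it was fitted to. It does not fix the problem described above, since an exponential never hits a hard floor or ceiling the way levels 1 and 200 do.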

    I'm giving up on trying to guesstimate the level transitions in the ranking and just going with the good-old binary search. Time complexity is O(k log(n)) where n=2000 (40000 pages divided by 20 geometrically spaced samples of known levels) and k=200. I should be able to find an answer to my question in ~2000 requests on the full ranking page.
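Here is a minimal sketch of that binary search, assuming the ranking is sorted by level in descending order. `level_at` is a hypothetical stand-in for one request to the rankings page, and the ranking itself is made up for illustration.

```python
import math


def first_rank_below(level_at, n_ranks, level):
    """Smallest 0-based rank whose level is below `level`.

    Equivalently, the number of characters at `level` or higher.
    """
    lo, hi = 0, n_ranks
    while lo < hi:
        mid = (lo + hi) // 2
        if level_at(mid) >= level:
            lo = mid + 1
        else:
            hi = mid
    return lo


# Made-up ranking sorted by level, descending (NOT scraped data).
ranking = [200] * 3 + [199] * 5 + [150] * 10 + [30] * 100
request_log = []  # one entry per simulated request


def level_at(rank):
    request_log.append(rank)
    return ranking[rank]


at_200 = first_rank_below(level_at, len(ranking), 200)
at_199_or_more = first_rank_below(level_at, len(ranking), 199)
print(at_200, at_199_or_more - at_200, len(request_log))

# Worst case for the numbers in the post: k = 200 levels, n = 2000 pages each.
worst_case = 200 * math.ceil(math.log2(2000))
print(worst_case)
```

Subtracting two neighbouring boundaries gives the number of characters at a single level, which is the quantity the search is after.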

    Scraped data can be found at https://storage.googleapis.com/geospiza/ml-player-growth/initial_sample.json and looks like this:

    Code:
    [
      {
       "index": 0,
       "timestamp": "2021-05-08T20:57:50.307054",
       "category": "all",
       "rank": 1,
       "name": "Unlucky",
       "job": "magicia",
       "specialization": "cleric",
       "mastery": "bishop",
       "level": 200
      },
      {
       "index": 1,
       "timestamp": "2021-05-08T20:57:50.307054",
       "category": "all",
       "rank": 2,
       "name": "Babo",
       "job": "thief",
       "specialization": "assassin",
       "mastery": "night lord",
       "level": 200
      },
      ...
    ]
    
     
  3. OP
    geospiza
    Notebooks:
    And data:

    I decided to leave the question of how many people are at a particular level for another time. The strategy of sampling with linearly and geometrically spaced buckets actually works well for how simple it is. I made a histogram with 100 buckets of each kind put together:

    upload_2021-5-9_20-37-35.png

    I get a good spread of the higher level people and a good spread of the lower/mid leveled players by choosing these pages in the rankings.

    upload_2021-5-9_20-38-58.png

    It does miss a few levels in the middle, but this unblocks me for the next steps. I get a list of 1000 characters from the 200 pages above. I scraped the players' histories to get data that looks like this:

    Code:
    [{'name': 'geospiza', 'level': 142, 'timestamp': '2021-04-28T02:57:30'},
     {'name': 'geospiza', 'level': 141, 'timestamp': '2021-03-21T03:22:02'},
     {'name': 'geospiza', 'level': 140, 'timestamp': '2021-03-05T02:43:26'}]
    
    I dumped both the "ranking" and "levels" data into a sqlite database so I can write some SQL to make some nice plots. The first thing I did to validate the data was to plot the count of players who hit level 2 in a particular month.

    Code:
        -- number of distinct players who hit level 2 in each month
        select
            date(timestamp, 'start of month') as month,
            count(distinct name) as n
        from levels
        where level = 2
        group by 1
    

    upload_2021-5-9_20-47-59.png

    June-July 2017 and April-May 2020 are the peaks in the plot, each with more than 35 players joining (out of 1000 in the sample). The reason for the spike in 2020 is easily explained by world events. I asked in Discord about the spike in 2017; Kimmy said that this was due to Facebook advertising around that time.
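If you want to try the month-by-month count yourself, here is a small self-contained sketch using Python's built-in sqlite3 module. The table layout mirrors the "levels" records shown above, but the rows are made up for illustration and are not scraped data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table levels (name text, level integer, timestamp text)")

# Made-up rows for illustration only.
rows = [
    ("alice", 2, "2017-06-03T10:00:00"),
    ("alice", 3, "2017-06-03T11:00:00"),
    ("bob", 2, "2017-06-20T09:30:00"),
    ("carol", 2, "2020-04-11T18:45:00"),
]
con.executemany("insert into levels values (?, ?, ?)", rows)

query = """
    select
        date(timestamp, 'start of month') as month,
        count(distinct name) as n
    from levels
    where level = 2
    group by 1
    order by 1
"""
new_players = con.execute(query).fetchall()
print(new_players)  # [('2017-06-01', 2), ('2020-04-01', 1)]
```

Hitting level 2 is used as a proxy for "started playing", since it is the earliest level-up a history can contain.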

    With that, the sampled data looks great. We need to keep in mind that because of the sampling methodology, there is a bias toward higher level players. However, this should be independent of when someone starts. In other words, the assumption is that the position in the rankings doesn't say anything about when you started.

    Now I want to compare people who started in 2017 to those who started in 2020. First, I wrote a query that can be used to compare players relative to their starting date. Each player has 1 row for each observed day.

    upload_2021-5-9_20-59-35.png

    It's kind of interesting to see levels flatten out in these plots. Once I had this, I averaged the levels within each cohort (a group of players with the same starting month). I chose 3 cohorts:
    • 2017-07 - Big growth event due to FB campaign (n=44)
    • 2020-04 - Big growth event due to covid-19 (n=43)
    • 2018-10 - The median month based on number of new players (control, n=12)
    I also wrote some ungodly SQL to get the following plot:

    Code:
        -- cohorts are people who started in the same month
        with cohort as (
            select
                name,
                min(timestamp) as starting_ts,
                date(min(timestamp), "start of month") as starting_month
            from levels
            where level = 2
            group by 1
        ),
        -- we'll only look at 90 days after people start playing
        day_range as (
                select 0 day
            union
                select day + 1
                from day_range
                where day <= 90
        ),
        -- we take a look at mid-2017, beginning of 2020, and a median date
        selected_cohort as (
            select distinct
                starting_month,
                name,
                cast(julianday(timestamp)-julianday(starting_ts) as integer) as difference,
                level
            from levels
            join cohort
            using (name)
            where starting_month in ("2017-07-01", "2020-04-01", "2018-10-01")
        ),
        -- many players will level more than once in a day, so choose the level on the day boundary
        level_at_day as (
            select
                starting_month,
                name,
                day,
                max(level) over (
                    partition by starting_month, name
                    order by level
                    rows 1 preceding
                ) as level
            from day_range
            cross join selected_cohort
            where difference <= day
        ),
        -- get the max level for each day; this one is particularly interesting for other analysis
        max_level_at_day as (
            select
                starting_month,
                name,
                day,
                max(level) as level
            from level_at_day
            group by 1, 2, 3
        )
        select
            starting_month,
            day,
            sum(level) as sum_level,
            avg(level) as avg_level
        from max_level_at_day
        group by 1, 2
        order by 1, 2
    

    upload_2021-5-9_21-10-13.png

    There's a marked difference in the average level across all three cohorts. A month after starting, characters from the 2018 cohort continue to have a lower average level. Players in 2020 have the highest level across all time periods. One potential reason for the difference between 2020 and 2017 is that the anniversary event of 2020 had an unprecedented 40% experience boost.
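If the SQL is hard to follow, the core step (the level a player holds at the end of each day since starting, then the average per day) can be sketched in plain Python. This is an illustrative re-expression, not the query itself: the function names are mine, the histories are made up, and I'm assuming each history starts at the player's first recorded level.

```python
from bisect import bisect_right
from datetime import datetime


def daily_levels(history, n_days=90):
    """Level held at the end of each day 0..n_days after the first record.

    history: list of (iso_timestamp, level) pairs in any order.
    """
    events = sorted((datetime.fromisoformat(ts), lvl) for ts, lvl in history)
    start = events[0][0]
    offsets = [(ts - start).days for ts, _ in events]  # whole days since start
    levels = [lvl for _, lvl in events]
    out = []
    for day in range(n_days + 1):
        seen = bisect_right(offsets, day)  # events that happened by this day
        out.append(max(levels[:seen]))
    return out


def cohort_average(histories, n_days=90):
    """Average level per day across all players in a cohort."""
    per_player = [daily_levels(h, n_days) for h in histories]
    return [sum(day) / len(day) for day in zip(*per_player)]


# Made-up histories (NOT scraped data).
player_a = [("2020-04-01T00:00:00", 2), ("2020-04-01T05:00:00", 10),
            ("2020-04-03T12:00:00", 15)]
player_b = [("2020-04-10T08:00:00", 2), ("2020-04-11T09:00:00", 8)]

print(daily_levels(player_a, n_days=3))
print(cohort_average([player_a, player_b], n_days=3))
```

Carrying the last known level forward is what makes the curves flatten out once a player stops leveling.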

    I'll probably play around with some survival analysis next, but I'm also open to any requests since I do have the tools to scrape samples from any of the rankings pages (speaking of which, the guild ones might be interesting...).
     
  4. Nightz
    Offline

    Nightz Supervisor Staff Member Supervisor Game Moderator

    1,796
    1,038
    490
    Oct 22, 2020
    Male
    9:15 AM
    Nightz
    I/L Arch Mage
    200
    Funk & Pasta
  5. Slime
    Online

    Slime Pixel Artist Retired Staff

    641
    1,185
    381
    Apr 8, 2015
    Male
    Israel
    10:15 AM
    Slime / OmokTeacher
    Beginner
    102
    Flow
    Important to note that "40% experience boost" here means a 20% increase actually, because we're already x2.
    Generally the farther back you go I would expect people to level slower, because new content is always being released and is usually made to be useful for leveling, otherwise it's irrelevant.
    And also because the economy develops and you have an easier time buying stuff, for example I remember in 2016 Zhelms were like 30m-50m each and obviously no "afk service" yet.
    Awesome stuff btw keep it up!
     
  6. OP
    geospiza
    I thought the experience was 1.4x the base rate we get on the server? That multiplier is still 2-4x the amount we've seen in the events since, and it doesn't sound like it's going to come back (from what I read here on the forums). The idea that leveling is getting easier makes a ton of sense btw, it's very plausible it's the case.
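The two readings of "40% experience boost" in this exchange come out differently, so here is the arithmetic for both. Only the x2 server rate and the 40% figure come from the thread; which reading matches how the event actually worked is not something I can confirm from here.

```python
server_rate = 2.0   # the server's normal EXP multiplier ("we're already x2")
boost = 0.4         # the advertised 40% boost

# Reading 1: the boost is added relative to the 1x base game rate.
additive_rate = server_rate + boost
additive_gain = additive_rate / server_rate - 1

# Reading 2: the boost multiplies the server's own rate (1.4x of what we get).
multiplicative_rate = server_rate * (1 + boost)
multiplicative_gain = multiplicative_rate / server_rate - 1

print(f"additive: {additive_rate:.1f}x total, {additive_gain:.0%} faster")
print(f"multiplicative: {multiplicative_rate:.1f}x total, {multiplicative_gain:.0%} faster")
```

Under the first reading players level 20% faster than usual, under the second 40% faster.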

    I put together another plot by year. I dropped 2016 because ranking wasn't implemented until a few months in, and dropped 2021 because it's too recent.

    upload_2021-5-10_22-49-8.png

    I expected to see the years ladder up with 2016 at the bottom and 2020 at top. It doesn't seem to be the case with 2018 being the second lowest average. I made a boxplot at the 90 day mark to see what the spread looks like:

    upload_2021-5-10_22-52-20.png

    I don't see any clear trends here, so I'm guessing the average level might not be the best way to summarize things.

    As a bonus, I also made some art by plotting all of the leveling curves on the same plot.

    upload_2021-5-10_22-55-3.png
     
  7. Jafel
    Offline

    Jafel Capt. Latanica

    345
    168
    278
    May 3, 2015
    Male
    9:15 AM
    KingSlime
    Outlaw
    70
    Twice
    The art looks like my drawings with crayon in elementary school. <3

    You could plot the medians as well and compare them to the means to see whether that follows a more expected pattern, as means are sensitive to extreme values.
     
  8. fartsy
    Offline

    fartsy Zakum

    1,337
    790
    471
    Jun 29, 2017
    Male
    2:15 AM
    Fartsy
    F/P Wizard
    Pasta
  9. OP
    geospiza
    I kind of eyeballed the box plots; I made one each for day 30, 60, and 90. While there does seem to be a trend, 2020's median level is lower than 2019's (and even 2018's). I'll try it again at some point, but medians are not as easy to compute in sqlite as averages.
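One possible workaround for the missing median, assuming the queries are run from Python: the built-in sqlite3 module lets you register a custom aggregate, so a median can be used in SQL just like avg. The table and rows below are made up for illustration.

```python
import sqlite3
import statistics


class Median:
    """Custom SQLite aggregate: collects values, returns their median."""

    def __init__(self):
        self.values = []

    def step(self, value):
        if value is not None:
            self.values.append(value)

    def finalize(self):
        return statistics.median(self.values) if self.values else None


con = sqlite3.connect(":memory:")
con.create_aggregate("median", 1, Median)

# Hypothetical table of levels at the 90-day mark (NOT scraped data).
con.execute("create table day90 (year integer, level integer)")
con.executemany(
    "insert into day90 values (?, ?)",
    [(2019, 30), (2019, 70), (2019, 35),
     (2020, 20), (2020, 30), (2020, 120), (2020, 40)],
)

result = con.execute(
    "select year, median(level), avg(level) from day90 group by year order by year"
).fetchall()
print(result)
```

In this toy data both years share a median of 35 while the averages differ (45 vs 52.5), which is the kind of outlier-driven gap between mean and median discussed above.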

    I think the right thing to do is to fit a model to the different populations and test them for statistical significance. There's definitely something there, but it'd be nice to be a little more rigorous. I want a plot like this:


    Alongside testing that 2020 > 2019 > ... > 2016, I also want to see if there are differences between leveling different classes, and see what the effect of events is on leveling.
     
  10. OP
    geospiza
    Notebook: https://github.com/geospiza-fortis/ml-player-growth/blob/main/notebooks/2021-05-11-lifelines.ipynb

    So I got around to playing with the lifelines library for survival analysis, a branch of statistics that analyzes the time until an event occurs. The library gives a way to predict the time until "death," even when "death" hasn't happened yet.

    I chose job advancement levels as a death event (e.g., hitting level 70 means you're dead in this model). The query to prepare the data ends up being very simple:

    Code:
            with recent as (
                select
                    name,
                    max(level) as level,
                    min(timestamp) as start_ts,
                    max(timestamp) as end_ts
                from levels
                where level <= 70
                group by 1
            )
            select
                name,
                julianday(end_ts)-julianday(start_ts) as duration,
                level = 70 as observed
            from recent
    

    The data ends up in the following form.

    upload_2021-5-11_23-55-48.png

    The model takes a duration (the time since the first level in days) and observation (1 if the character hit level 70 otherwise 0). The output is a plot:

    upload_2021-5-11_23-31-29.png

    The line represents the probability that a character is alive (not level 70) over time. After one day, there's a 100% chance that everyone is still alive and not level 70. The median "survival time" is 38.2 days, which means that there's a 50% chance that a character is still leveling after 38 days. (Apologies if the chart is still unclear, I'm still in the process of understanding).
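To make the curve less of a black box, here is a hand-rolled Kaplan-Meier estimator in plain Python. I'm assuming, not asserting, that this is the same kind of estimator used in the notebook; the function names and the tiny dataset are made up for illustration.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time where at least one event was observed.

    durations: days from first level to the event (or to the last sighting).
    observed:  1/True if the character hit the target level, else 0/False.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= deaths + censored
    return curve


def median_survival_time(curve):
    """First time the curve drops to 50% or below, or infinity if it never does."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return float("inf")


# Made-up sample (NOT scraped data).
durations = [5, 10, 10, 20, 30]
observed = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, observed)
print(curve, median_survival_time(curve))
```

Characters that have not hit the level yet still count toward the "at risk" pool until their last sighting, which is how the method uses incomplete histories instead of throwing them away.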

    I modeled survival rates for 2nd, 3rd, and 4th jobs:

    upload_2021-5-11_23-37-26.png

    The plot should make some sense: it takes longer to reach the 4th job than the 3rd job, so the probability of still being alive (not hitting the level) is higher.

    It's a bit difficult to wrap my head around the model, but it looks pretty. Furthermore, it can give confident answers given the right questions/hypotheses. For example, I could use this model to compare the survival rates of players who start during an event vs. players who start outside of an event. While interpreting, there are also things to consider, such as the sampling method that favors higher leveled players. There may also be biases I'm introducing when preparing the data (such as the unobserved values of players who will never reach a particular level).

    Jafel recommended looking at some simpler statistics too; I'll probably take a look at the churn rate of different periods in the next post.
     

  11. OP
    geospiza
    Welp, looks like I did prepare the data wrong. I fixed it up so the duration of players that quit is the time from their first level to the time I scraped, instead of the time to their last level. The last plot up there should look like this:

    upload_2021-5-12_0-55-26.png

    This is much easier to read. As time goes to infinity, ~60% of players make it to second job, ~45% to third job, and ~35% to fourth job. The median ends up being infinity, so I had to choose a larger percentile (the upper quartile or p75). This is the time when 75% of characters are still leveling.
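The fix described in this post can be sketched as follows: a character that never reached the target level gets a duration measured up to the scrape time and is marked as unobserved. The function name and timestamps are hypothetical.

```python
from datetime import datetime


def to_duration(first_ts, target_ts, scrape_ts):
    """Return (duration_in_days, observed) for one character.

    first_ts:  timestamp of the first recorded level.
    target_ts: timestamp when the target level was reached, or None if never.
    scrape_ts: timestamp when the data was scraped.
    """
    start = datetime.fromisoformat(first_ts)
    if target_ts is not None:
        end, observed = datetime.fromisoformat(target_ts), True
    else:
        # Still "alive": all we know is they had not hit the level by scrape time.
        end, observed = datetime.fromisoformat(scrape_ts), False
    return (end - start).total_seconds() / 86400, observed


scrape = "2021-05-11T00:00:00"  # made-up scrape time
reached = to_duration("2021-01-01T00:00:00", "2021-01-15T12:00:00", scrape)
never = to_duration("2021-01-01T00:00:00", None, scrape)
print(reached, never)
```

Ending an unobserved character's clock at their last level-up would make it look like they dropped out of the data much earlier than they really did, which is the mistake being corrected here.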
     
  12. Floris
    Offline

    Floris Capt. Latanica

    367
    261
    273
    May 27, 2020
    Male
    9:15 AM
    Loving this thread, would definitely be very interested in your idea to determine the leveling speed of various classes!
     
  13. IHealForYou
    Offline

    IHealForYou King Slime

    28
    23
    21
    Jan 16, 2021
    Male
    9:15 AM
    LegendKnight, IHealForYou
    Dark Knight
    128
    Losers
    What is the fastest time to 120? :O
     
  14. OP
    geospiza
    It's 4.7 days, which is pretty damn fast. Here's a survival curve for players who make it to 120:

    upload_2021-5-12_20-35-26.png

    Roughly 10% of the sampled population makes it to level 120 in 34 days. I also looked at this going to infinity. In the label, I put the fraction of people that make it to 120:

    upload_2021-5-12_20-38-35.png

    The fraction from the model (0.45) is comparable to the fraction of players 120+ in the dataset done by simple counting (0.35). The 0.1 difference comes from the fact that the survival model takes into account players who will eventually make it to level 120 (but haven't yet).

    Thanks! So here I took a look at both the "churn" and the leveling speed based on class.

    So first I did a plot of the number of players at a particular level based on the histories:

    upload_2021-5-12_20-42-46.png

    Magestory, amirite? It looks like thieves and warriors come next in terms of number of players, with bowmen and pirates in last. I also plotted this as a fraction of the starting players for each job:

    upload_2021-5-12_20-54-25.png

    While there is a large percentage of mages that stick around, there's a larger proportion of bowmen who reach level 200. Neat. I did a similar plot with survival curves for each job. This is similar to looking at the slice of the above plot for level 120, but also including the time to level.

    upload_2021-5-12_20-55-41.png

    If you read the charts carefully, the ranking of the % of people who make it to level 120 is the same. With the second plot, we can also compare the time to level 120.

    upload_2021-5-12_21-2-20.png

    This summarizes the above charts, in addition to a new column p90. This is the time that it takes for 10% of the population to reach level 120. We see that it generally takes much less time to train a magician than any other class (24 days). Pirates have it tough.

    This can also be broken down to compare class by class. Note that the plot above is super noisy because of all of the overlapping confidence intervals. We can just plot two lines side by side:

    upload_2021-5-12_21-4-16.png

    This is easy to read: there are more mages than warriors that make it to 120 and they level faster. The confidence intervals are not touching, so the result is very clear.

    upload_2021-5-12_21-4-55.png

    This is less easy to read because the confidence intervals overlap. We can infer the same as we did previously though. Because we're using a statistical model, we can run a test called the logrank test to quantify the differences. It turns out we get a p-value of 0.05, which is right at the conventional 0.05 cutoff, so the difference between these two survival curves is only borderline statistically significant. For reference, the p-value of mages vs warriors is <0.005, while comparing mages to themselves results in a p-value of 1.
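For the curious, the logrank test compares how many events each group actually had against how many it would be expected to have if both groups shared one survival curve. Below is a hand-rolled two-group version using only the standard library. It is a sketch of the textbook statistic, not the library routine used in the notebook, and the durations are made up.

```python
import math


def logrank_p_value(dur_a, obs_a, dur_b, obs_b):
    """Two-group logrank test; returns the p-value (chi-square, 1 d.o.f.)."""
    event_times = sorted(
        {t for t, o in zip(dur_a, obs_a) if o}
        | {t for t, o in zip(dur_b, obs_b) if o}
    )
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for d in dur_a if d >= t)  # still at risk in group A
        n_b = sum(1 for d in dur_b if d >= t)  # still at risk in group B
        d_a = sum(1 for d, o in zip(dur_a, obs_a) if o and d == t)
        d_b = sum(1 for d, o in zip(dur_b, obs_b) if o and d == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    stat = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2))


# Made-up durations in days (NOT scraped data); every event is observed.
fast = [1, 2, 3, 4, 5, 6, 7, 8]
slow = [21, 22, 23, 24, 25, 26, 27, 28]
everyone = [1] * 8

p_different = logrank_p_value(fast, everyone, slow, everyone)
p_same = logrank_p_value(fast, everyone, fast, everyone)
print(p_different, p_same)
```

Comparing a group with itself gives a p-value of 1, matching the mage-vs-mage sanity check above, while clearly separated groups give a very small p-value.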

    I'm pretty pleased with the survival analysis stuff, it's been much easier to use than I anticipated. It also takes care of the cases where players are still active and leveling up. I think up next is to apply the same stuff, but by different starting dates (and maybe taking a look at events too).

    Notebook for today: https://github.com/geospiza-fortis/ml-player-growth/blob/main/notebooks/2021-05-12-by-class.ipynb
     
  15. iPippy
    Offline

    iPippy Nightshadow

    661
    344
    345
    May 19, 2019
    Male
    3:15 AM
    iPippy
    Looks like it's time for pirate buffs. Stats dont lie.
     
  16. d3lm
    Offline

    d3lm Selkie Jr.

    200
    84
    210
    May 11, 2020
    Male
    2:15 PM
    11
  17. OP
    geospiza
    More plots! Remember to take them with a grain of salt because of how much they've been sampled.

    Instead of comparing classes, I compared different time ranges. Looking at the weird differences between two months in 2017 and 2020:

    upload_2021-5-19_22-15-10.png

    More people end up getting to level 70 in the 2020 month (39% of players in 2017 vs 53% in 2020). They also get there faster on average (29 days in 2017 vs 14 days in 2020).

    After this, I took a look at the survival rate over each quarter since the server start. As a reminder, here is the number of new players per quarter:

    upload_2021-5-19_22-42-26.png

    upload_2021-5-19_22-21-50.png

    Summer of 2016 and 2017 were good for the server; players from these cohorts were more likely to stick around (about 50% of the cohort). Late 2019/early 2020 were also comparatively good. I think the reason that the first two summers peak so much is (1) summer break means more time for maple and (2) the data is skewed toward players from the top of the rankings.

    upload_2021-5-19_22-32-57.png

    There's a notable dip in the time it takes to get to level 70 after 2019. I wonder if this is a side-effect of having more data for those quarters.

    I put this one in a spoiler because it doesn't cleanly fit into the narrative, but the plot is interesting nevertheless with the level 70 stuff in context.

    upload_2021-5-19_22-27-15.png


    upload_2021-5-19_22-27-35.png

    Finally, did the same again month by month for 2020. Some notable events in 2020:
    • 2020-02-16 Lunar New Year + Valentines patch
    • 2020-05-22 Anniversary patch
    • 2020-08-30 Summer patch
    • 2020-10-31 Halloween patch
    • 2020-12-20 Christmas patch

    upload_2021-5-19_22-34-44.png

    upload_2021-5-19_22-36-37.png

    Those who started slightly before the anniversary event zipped to level 70 (14 days) and have the highest chance of survival (55%). It seems like the summer event was also a good time for another batch of maplers.

    With this, I've exhausted this line of thought. I would like a more comprehensive dataset to be more confident in the results, but this sample has been interesting to look at. It looks like the only question that I've yet to answer now is the distribution of levels, which will hopefully come next time.

    Bonus plot that doubles as art:

    upload_2021-5-19_22-44-49.png
     

  18. whatdatoast
    Offline

    whatdatoast Windraider

    469
    122
    301
    Apr 9, 2020
    12:15 AM
    whatdatoast
    Bowman
    i'm curious if there's a better way to filter out mules. I have a feeling there's a bunch of characters stuck at lvl 16 (storage mules), lvl 41 (HB), lvl 77 (MU), lvl 81 (HS).
    [​IMG]

    This plot seems to support that theory. There's a nice plateau at each of those marks. These mules probably also heavily skew your leveling rate plot, cus more often than not, these mules are leeched.

    Based on how many fame mules are made by a few individuals, i'd reckon they also make up a non-trivial number of new players. Maybe a good way to filter these out is just to strip out the ending numbers and look for duplicates (ShivFameMule001, etc).

    I think another cool plot you can do is "time until next level" for each class (or even job), and filter out long breaks (>1 month maybe). I wonder if there's enough signal to see how various meta training spots effect leveling rate. Some breakpoints could be Ariant questline (31-35), early LPQ, FoG, mp3/gs2, CDs, F/P misting, CB meso bombing, etc. Y̶o̶u̶ ̶c̶a̶n̶ ̶a̶l̶s̶o̶ ̶f̶i̶n̶d̶ ̶a̶l̶l̶ ̶t̶h̶o̶s̶e̶ ̶d̶i̶r̶t̶y̶ ̶c̶h̶e̶a̶t̶e̶r̶s̶ ̶w̶h̶o̶ ̶b̶u̶y̶ ̶l̶e̶e̶c̶h̶.̶ ̶

    (I just noticed you did this already for lvl 200 chars)
     
  19. OP
    geospiza
    I need more data to make the most out of filtering individual characters, but you make a good point about mules. It turns out ~6% of the dataset has names that end with a digit:

    Code:
    ['TicTacToe3', '9812', 'Nightmare96', 'Blu301', 'baratun1',
          'LilTay420', 'Niko28', 'penguinvivi2', 'AlterEg0', 'Stoners95',
          'ElBrujo18', 'Gamecube945', 'SIm13', 'Menche13', 'avoxo0',
          'healer37', 'APRMule1', 'ayshays22', 'Ladyluck1', 'Shaggyx3',
          'RawrPanda96', 'mage1', 'shadowchild5', 'sputnik666', 'Lojista1',
          'Brownie1', 'Beep0', 'Gruncle2', 'LiLAzNb0i24', 'Joonas47',
          'DarkNight01', 'HotBabe69', 'Shu31', 'ConCho123', 'VonLeon96',
          'Suspect79', 'KDTrey6', 'buttcrak360', 'Luuk951', 'DefScrolls1',
          'black2034', 'Wizard101', 'MrMuIe2', 'ShivFAME193', 'ShivFAME196',
          'ShivFAME199', 'ShivFAME202', 'luck17', 'luck18', 'GunNEkuDea01',
          'sfmsfm084', 'afo32', 'Idoco23', 'Pie260', 'Donor94',
          'StoreYouAh2', 'Netto1', 'Alan715', '0100', 'Munklemule6',
          'Parkdragon2', 'Amitrot100', 'Reo01', 'Mule920', 'famed39',
          'famed50', 'charliebb55']
    
    4 of them are Shiv fame mules (unsure whether I should be surprised...). Hard to say how many of these are actually mules outside of the obviously named ones.
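A rough sketch of the "strip the ending numbers and look for duplicates" idea, using a handful of names from the list above plus one name without digits. The threshold of "more than one name per stem" is my own assumption, and a shared stem is only a hint, not proof that a character is a mule.

```python
import re
from collections import Counter

# A few names from the list above, plus one without trailing digits.
names = [
    "ShivFAME193", "ShivFAME196", "ShivFAME199", "ShivFAME202",
    "luck17", "luck18", "famed39", "famed50", "Wizard101", "geospiza",
]

trailing_digits = re.compile(r"\d+$")

# Names that end in at least one digit.
with_digits = [n for n in names if trailing_digits.search(n)]

# Group by the name with its trailing digits removed (case-insensitive).
stems = Counter(trailing_digits.sub("", n).lower() for n in with_digits)
suspected = {stem: count for stem, count in stems.items() if stem and count > 1}

print(len(with_digits) / len(names))
print(suspected)
```

Names made up entirely of digits (like '9812' in the list) strip down to an empty stem, so they are skipped rather than lumped together.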

    This one might be reasonable with more data, but it's way too noisy as it is now. It's good food for thought. A few of those meta training spots should be visible (F/P misting at FoT for sure), but others might be more difficult. If I had access to the leveling location too, it would open up the doors to answer other questions.

    I'm still trying to figure out if there's a general way of modeling time to next level, since the time between the previous levels certainly influences the time to the next. It'd be nice to be able to plug in a level history and get out an estimate. Taking the idea a little further, it's probably doable to estimate the number of leechers by extrapolating from a smaller labeled set of characters. Quantifying the number of leechers in the game's history would be an interesting tidbit to tout.
     
  20. whatdatoast
    Hmm interesting, I know for a fact some of these people have more mules...maybe there's some missing? Like Shivering has like 200+ mules i think?

    But yeah, given the data you have, it seems hard to estimate. The only idea I have is to merge the death timestamps, to act as checks to make sure a player is still "active". But probably not a worthwhile venture.

    I'm curious, how granular is guild data? Is there a history for when people join / leave guilds, or can you only see the current guild a player is in? would be super cool to see the rise and fall of the big guilds in the game. Also would be cool to see the progression for banned users if you can still view their data (i guess in general, anomaly detection for bad actors is useful).
     
