Posted by wrs1864b at Sep 22, 2018 4:57:28 PM
Re: Community Shopkeeping Spreadsheets Preview and Feature Requests
Do I cull outliers for data?

Arguments for: Well, just look at it. Someone's offering to sell iron at 10k/each, see what it does to a chart.

This isn't useful to anybody.

Arguments against - The instant I start editing the data from which things are created is the instant it goes from pure 'Visualizer' of information to something with changes to the dataset.

If I do cull outliers, what criteria make sense?

As I said in my first post to this thread, finding a good "estimated cost" can be tricky, and it is something I beat my head against for quite a while.

First, I highly recommend you do what makes sense to you.

For my "market value" estimates, I definitely heavily cull and munge the data. As you point out, this means I can't quickly explain how I get my numbers. I ended up focusing on the definition of "the price you can expect if you buy/sell a reasonable quantity if you are willing to sail a bit" and if my program didn't give a reasonable answer, I'd tweak it toward this definition.

First, I take all the data and throw out "bogus prices". That is, I throw out anything that is too far from my "market value".

Then, I use a heavily truncated mean. I throw out almost the top 5%, and almost the bottom 95% of the culled data, and average the remaining <1% sliver.

Next, since I now have a "market value", I know which bogus price points to throw out in the first step!

Basically, I just keep repeating the above steps until my market value number converges. I can prove that it will always converge eventually, and in practice it only takes 2-3 tries.
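
For anyone who wants to try something similar, here is a rough Python sketch of that loop. It is not my actual program, and several pieces are assumptions made only for illustration: the function names, the median as the starting guess, the 0.05x to 5x band for "too far from market value" (borrowed from the plotting range), the 0.945/0.955 cut points standing in for "almost the bottom 95%" and "almost the top 5%", and the fact that it ignores quantities.

```python
from statistics import mean, median


def truncated_mean(prices, low_cut=0.945, high_cut=0.955):
    """Average the thin slice of sorted prices between two cut points.

    low_cut and high_cut are assumed values: drop roughly the bottom
    94.5% and the top 4.5%, and average what is left in between.
    """
    ordered = sorted(prices)
    n = len(ordered)
    lo = int(n * low_cut)
    hi = max(lo + 1, int(n * high_cut))  # always keep at least one point
    return mean(ordered[lo:hi])


def estimate_market_value(prices, band=(0.05, 5.0), tol=1e-9, max_iter=50):
    """Repeat cull-then-average until the estimate stops moving."""
    estimate = median(prices)  # assumed starting guess
    for _ in range(max_iter):
        # Step 1: throw out "bogus" prices too far from the current estimate.
        kept = [p for p in prices
                if band[0] * estimate <= p <= band[1] * estimate]
        if not kept:
            break
        # Step 2: heavily truncated mean of what survived the cull.
        new_estimate = truncated_mean(kept)
        # Step 3: stop once the number has converged.
        if abs(new_estimate - estimate) <= tol * max(1.0, abs(estimate)):
            return new_estimate
        estimate = new_estimate
    return estimate
```

Calling estimate_market_value() on a list of offer prices that contains a couple of 10k listings should leave those listings out of the estimate, because they fall outside the band around the starting guess and never reach the averaging step.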

This means that a "reasonable amount" is up to a few percent of the market, and since I don't consider most of the top 5%, you don't have to sail to every island to get the absolute best price.

Yeah, how I get my market values is messy, and if you tried to do something similar, you would almost certainly end up with slightly different numbers. In practice, I've found that there isn't a perfect number, and as long as the numbers are reasonable, you are fine. After all, the game is designed to have prices fluctuate, so you can never hit this moving target exactly.


Again, try to get an answer that you find useful.

edit: D'oh! The whole point of my rambling was that my program that plots the price curve normally shows only data between 0.05*mkt_price and 5*mkt_price. I have an option to show everything.
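
A display-only filter like that could be sketched as below. The function name, the (price, quantity) tuple shape, and the show_all flag are made up for illustration; only the 0.05 and 5 multipliers come from my program. The underlying data is left alone, so the chart stays a pure visualizer and only what gets drawn changes.

```python
def points_to_plot(offers, mkt_price, show_all=False, low=0.05, high=5.0):
    """Pick which offers to draw on the price curve.

    offers is assumed to be a list of (price, quantity) tuples.
    Nothing is deleted from the dataset; this only selects what is shown.
    """
    if show_all:
        return list(offers)
    return [(price, qty) for price, qty in offers
            if low * mkt_price <= price <= high * mkt_price]
```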
