Some feature Ideas

Hey,
I like the idea of this project very much. I've been working with the data sets a lot lately, mostly using the bulk download feature to derive some value-based stock screeners from fundamental data (I'm trying to implement a "magic formula" stock picker - details here, for whoever is interested).

So far a few things have come to mind, so I'm listing them as ideas because they might save some work for others:

1. MarketCap is missing from the data sets - I understand it is price related and can be derived from number of shares * price, but since the number of shares is also "fundamental" data (and is only available once per quarter in the price data sets), I think it would be very useful to add market cap to the fundamentals data sets. Almost any screener uses market cap as a criterion, so it is needed for all basic screening operations.
2. Ratios and other derived values - EV, for example, is present in the "find" option but missing from the bulk download sets, and all other ratios and metrics seem to be missing as well. If the data is already there, it would be nice to add it.
3. Working capital - personally I'm looking for that. I can compute it myself in Python from the values that are present, but it might be useful for others as well.
4. Maybe have more options to reduce the size of the bulk data sets - for example, get just the last report, or yearly vs. monthly reports. For the price sets, maybe offer something that is not per-day, like a monthly average.
5. A price set by itself, without the fundamentals, could also be useful, as it would be smaller.
6. The download option in the "find" section only allows downloading the data as xlsx files. I would favour a CSV option so the data can be parsed offline in Python (xlsx is harder to handle for these tasks).

Tell me if I got something wrong, as these are just rough thoughts.
I'll add more as I come across it. I'm currently working on implementing all of the above offline in Python - suggestions are also welcome :-)
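
For anyone who wants to derive the values from points 1 and 3 offline in the meantime, here is a minimal sketch. It assumes the standard definitions (market cap = shares outstanding * price, working capital = current assets - current liabilities). The field names, the figures and the screening threshold are made up for illustration and are not SimFin column names.

```python
def market_cap(shares_outstanding, price):
    """Market capitalisation = shares outstanding * share price."""
    return shares_outstanding * price


def working_capital(current_assets, current_liabilities):
    """Working capital = total current assets - total current liabilities."""
    return current_assets - current_liabilities


# Made-up figures for a hypothetical company, for illustration only.
report = {
    "ticker": "XYZ",
    "shares_outstanding": 50_000_000,
    "current_assets": 800_000_000.0,
    "current_liabilities": 500_000_000.0,
}
price = 20.0  # hypothetical closing price on the screening date

report["market_cap"] = market_cap(report["shares_outstanding"], price)
report["working_capital"] = working_capital(
    report["current_assets"], report["current_liabilities"]
)

# A basic screen: keep companies above an (arbitrary) minimum market cap.
MIN_MARKET_CAP = 100e6
passes_screen = report["market_cap"] >= MIN_MARKET_CAP
print(report["market_cap"], report["working_capital"], passes_screen)
```

Note that with several share classes the single "shares * price" product is only an approximation, so treat the result with some care.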

Comments

  • edited November 2017
    Hi ran_ma,

    some great ideas here!

    1.+2. Market cap should be available in the "Stock prices + Fundamentals (Detailed)" data sets, as well as EV.
    There are currently some problems with shares outstanding for some companies, though (mostly when there are several share classes, which are not weighted correctly when calculating the total market cap; see VISA for example). That's why there might still be some weird values for market cap in the dataset. I am currently reworking the shares outstanding crawler completely, so it should be more accurate with the new update (which hopefully goes online in a few days).
    3. You can find change in working capital in the "detailed" data sets on the bulk download page (not in the "basic" data sets though).
    4. I am planning to release an API asap, but I have a lot of things to do currently. I think the API will solve the cases where only a share of the data is relevant for someone, or do you disagree?
    5. I can add a price set only download if you are interested in that!
    6. OK, good point, I'll add a CSV option for the "normal" downloads too then.
  • I'll check on the market cap and EV - I didn't see them in the detailed set but will check again later.
    If they're there then I, personally, have no need for a price-only dataset (I only wanted that to get the market cap). I do remember that the detailed data set is on a daily basis, which is what makes it so big, although the only thing that changes on a daily basis is the price. So it would make sense to separate the price dataset from all the other fundamental data and have two data sets with different time granularities.

    Regarding working capital - I think there should be a "working capital" column, and not just "change in working capital" (this is what I'm looking for, anyway).

  • OK - so I see why I missed that.
    For the bulk data downloads:
    1. Fundamentals detailed is quarterly, and doesn't have MarketCap and EV
    2. Fundamentals + prices detailed has EV and market cap, but is daily

    So I guess what I would be happy to get is a mix of both ...
    I was under the assumption that they hold the same fundamentals data and differ only in that #2 holds prices as well.
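
Until such a mixed data set exists, one possible workaround is to collapse the daily "fundamentals + prices" file to one row per company per quarter, keeping the last daily row in each quarter. The sketch below is only an illustration: the column names and values are hypothetical, the CSV is an inline stand-in for the real file, and it groups by calendar quarter, which will not match companies whose fiscal year is shifted.

```python
import csv
import io
from datetime import date

# Tiny inline stand-in for the daily "fundamentals + prices" file.
# Column names and values are hypothetical.
DAILY_CSV = """ticker,date,price,market_cap,ev
AAA,2016-03-30,10.0,1000,1200
AAA,2016-03-31,10.5,1050,1250
AAA,2016-06-29,11.0,1100,1300
AAA,2016-06-30,11.5,1150,1350
BBB,2016-03-31,5.0,500,450
"""


def quarter_of(d):
    """Return (year, quarter) for a calendar date."""
    return d.year, (d.month - 1) // 3 + 1


def last_row_per_quarter(rows):
    """Keep the latest daily row for each (ticker, year, quarter)."""
    latest = {}
    for row in rows:
        d = date.fromisoformat(row["date"])
        key = (row["ticker"],) + quarter_of(d)
        if key not in latest or d > date.fromisoformat(latest[key]["date"]):
            latest[key] = row
    return latest


quarterly = last_row_per_quarter(csv.DictReader(io.StringIO(DAILY_CSV)))
for key in sorted(quarterly):
    print(key, quarterly[key]["market_cap"], quarterly[key]["ev"])
```

Because the rows are streamed and only one row per key is kept, the memory footprint stays small even for the full daily file.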
  • Hey Thomas,
    Is there a way to get a "sample" of the full DB (fundamentals detailed + prices) to work on (only 1-2 companies, for example)? It just takes a lot of time to work on the entire DB every time while developing code.
  • Hey, sorry I missed your post here, do you have something by now or should I send you a test file?
  • I'm good - I created one myself.
    Thanks anyway!
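
For others who need a small development sample, a rough sketch of one way to cut it out of a bulk CSV follows. The function name `extract_sample`, the `ticker` column name and the demo rows are assumptions for illustration; the actual bulk file layout may differ.

```python
import csv
import io


def extract_sample(src, dst, tickers, ticker_column="ticker"):
    """Copy the header and the rows for the given tickers from src to dst.

    src and dst are open text file objects; rows are streamed, so the
    full file is never held in memory.
    """
    wanted = set(tickers)
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if row[ticker_column] in wanted:
            writer.writerow(row)
            kept += 1
    return kept


# Demo on an in-memory stand-in for the bulk file (made-up rows).
full = io.StringIO(
    "ticker,date,revenue\n"
    "AAA,2016-03-31,100\n"
    "BBB,2016-03-31,200\n"
    "AAA,2016-06-30,110\n"
    "CCC,2016-06-30,300\n"
)
sample = io.StringIO()
kept = extract_sample(full, sample, ["AAA"])
print(kept)
print(sample.getvalue())
```

With real files, the same function can be called with two handles opened via `open(path, newline="")`, one for reading and one for writing.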
  • Hey @tflassbeck,
    Just wanted to give an update on my progress with the above project, as it might be useful for SimFin development and for others as well.
    I got it all running using the SimFin bulk download, but then I ran into a problem:
    out of the ~1500 companies on file, only ~400 had 4 quarterly reports for 2016 that I could use for analysis; some had fewer than that, some had none.
    So this is not quite what I was hoping for, as I can't really make use of the results if more than 50% of the data is incomplete. It's understandable, considering the amount of work needed to get this service up and running and the fact that it is still in the early development stages. And I can see that it's growing with every passing day.
    Anyway - I then took a different approach: I'm now crawling the SEC servers directly for XBRL files, parsing them and building my database from that.
    I'm doing that using Python packages I wrote and patched together from all over. It works nicely - I now have a database of 3000+ stocks with marketCap > 100e6, all with standardised data.
    I'm checking in here just in case the system I have running now can be useful for SimFin as well - in order to get bulk standardised data.
    If you want more details - let me know.
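
The completeness check described in the comment above (which companies have all 4 quarterly reports for a given year) can be sketched roughly as follows. This is not the poster's actual code; the tuple layout and the sample data are made up for illustration.

```python
from collections import defaultdict

# Made-up (ticker, fiscal_year, fiscal_quarter) tuples, one per report on file.
reports = [
    ("AAA", 2016, 1), ("AAA", 2016, 2), ("AAA", 2016, 3), ("AAA", 2016, 4),
    ("BBB", 2016, 1), ("BBB", 2016, 3),
    ("CCC", 2015, 4),
]


def complete_tickers(reports, year, required=(1, 2, 3, 4)):
    """Return the sorted tickers that have every required quarter for `year`."""
    seen = defaultdict(set)
    for ticker, fiscal_year, quarter in reports:
        if fiscal_year == year:
            seen[ticker].add(quarter)
    return sorted(t for t, quarters in seen.items() if set(required) <= quarters)


all_tickers = {ticker for ticker, _, _ in reports}
complete = complete_tickers(reports, 2016)
coverage = len(complete) / len(all_tickers)
print(complete, f"{coverage:.0%}")
```

Running a check like this before screening makes it clear how much of the universe the results actually cover.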
  • edited January 2018
    Hi ran,

    We are currently running a big data update and correcting all the remaining mistakes for most companies, so in a few weeks all data will always be up to date and mostly error free (the remaining errors/problems are not displayed in the bulk download, which is why there are quite a few "gaps" at the moment).

    We are also parsing the XBRL files, but these are a) not standardised and b) have plenty of errors/mistakes, so that's what we are mostly trying to fix with our current software (not sure how you are doing this, since it does require some manual input in my opinion no matter what).

    Anyway, I would be curious to see your system running.

  • All companies should be updated now in the dataset. Values for FY 2017 are being added these days as they are released.
  • Hey @tflassbeck,
    I'll download the new batch and look into that - thanks for the update!