Missing data in specific quarters while extracting from API

Hi,
I have found many companies for which data is missing for specific quarters when I extract the data quarter by quarter using the API.
For example, data for VOXX is missing for Q4-2018, Q4-2017, Q4-2016, etc.


However, the data does exist in the SimFin database:


Can anyone help explain why the data is missing when it is extracted through the API?

Comments

  • ok that's weird, can you post the exact API call you are making?
  • edited January 8

    ok that's weird, can you post the exact API call you are making?

    Yes, it's almost identical to the Python tutorial code:

    import requests
    import pandas as pd
    
    # here you have to enter your actual API key from SimFin
    api_key = "****"
    
    path = './simfin_consolidated/'
    # list of tickers we want to get data for
    tickers = [
    "VOXX"
    ]
    
        
    # first: find SimFin IDs
    sim_ids = []
    for ticker in tickers:
    
        request_url = f'https://simfin.com/api/v1/info/find-id/ticker/{ticker}?api-key={api_key}'
        content = requests.get(request_url)
        data = content.json()
    
        if "error" in data or len(data) < 1:
            sim_ids.append(None)
        else:
            sim_ids.append(data[0]['simId'])
    
    print(sim_ids)
    
    final_df = pd.DataFrame()
    # define time periods for financial statement data
    statement_type = ["pl","cf","bs"]
    time_periods = ["Q1", "Q2", "Q3", "Q4"]
    year_start = 2007
    year_end = 2019
    # set up the XLSX writer
    #writer = pd.ExcelWriter("simfin_data_template.xlsx", engine='xlsxwriter')
    
    data = {}
    # get data for each ticker/id
    for idx, sim_id in enumerate(sim_ids):
        for st in statement_type:
            d = data[tickers[idx]] = {"Line Item": []}
            if sim_id is not None:
                for year in range(year_start, year_end + 1):
                    for time_period in time_periods:
    
                        # make time period identifier
                        period_identifier = time_period + "-" + str(year)
    
                        if period_identifier not in d:
                            d[period_identifier] = []
    
                        request_url = f'https://simfin.com/api/v1/companies/id/{sim_id}/statements/standardised?stype={st}&fyear={year}&ptype={time_period}&api-key={api_key}'
    
                        content = requests.get(request_url)
                        statement_data = content.json()
    
                        # collect line item names once, they are the same for all companies with the standardised data
                        if len(d['Line Item']) == 0 and 'values' in statement_data:
                            d['Line Item'] = [x['standardisedName']+'_'+st for x in statement_data['values']]
    
                        if 'values' in statement_data:
                            for item in statement_data['values']:
                                d[period_identifier].append(item['valueChosen'])
                        else:
                            # no data found for time period
                            d[period_identifier] = [None for _ in d['Line Item']]
    
                # fix the periods where no values were available
                len_target = len(d['Line Item'])
                if len_target > 0:
                    for k, v in d.items():
                        if len(v) != len_target:
                            d[k] = [None for _ in d['Line Item']]
    
            # convert to pandas dataframe and stack the statement types (pl, cf, bs);
            # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
            df = pd.DataFrame(data=d)
            final_df = pd.concat([final_df, df])
        if len(final_df.index) > 0:
            # save one XLSX file per ticker; the context manager saves and closes the file
            with pd.ExcelWriter(path + tickers[idx] + ".xlsx", engine='xlsxwriter') as writer:
                final_df.to_excel(writer)
        final_df = pd.DataFrame()
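
To narrow this down, a single period can be requested on its own and the raw response inspected. The following is a minimal sketch and not part of the tutorial: it reuses the v1 statement URL from the script above, relies only on the standard library, and the helper names (`build_statement_url`, `has_values`, `find_missing_periods`, `fetch_statement`) are made up for illustration. It also treats an empty `values` list as missing, which is slightly stricter than the `'values' in statement_data` check in the script.

```python
import json
import urllib.parse
import urllib.request

# Same v1 base URL as in the script above.
BASE_URL = "https://simfin.com/api/v1"


def build_statement_url(sim_id, stype, year, period, api_key):
    """Build the standardised-statement URL for one statement type and period."""
    query = urllib.parse.urlencode(
        {"stype": stype, "fyear": year, "ptype": period, "api-key": api_key}
    )
    return f"{BASE_URL}/companies/id/{sim_id}/statements/standardised?{query}"


def has_values(statement_data):
    """Return True if the decoded response is a dict with a non-empty 'values' list."""
    return isinstance(statement_data, dict) and bool(statement_data.get("values"))


def find_missing_periods(responses):
    """Return the period identifiers (e.g. "Q4-2018") whose response has no values.

    `responses` maps period identifiers to decoded JSON responses.
    """
    return [period for period, data in responses.items() if not has_values(data)]


def fetch_statement(sim_id, stype, year, period, api_key):
    """Fetch and decode one statement. This performs a network request."""
    url = build_statement_url(sim_id, stype, year, period, api_key)
    with urllib.request.urlopen(url) as response:
        return json.load(response)
```

Calling `fetch_statement(sim_id, "pl", 2018, "Q4", api_key)` with the ID printed by the script and printing the result should show whether the API returns an error message or simply no values for that period.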
    
  • closed as this issue has been addressed by e-mail.
This discussion has been closed.