12  Common Errors and Warnings in the API

If an analysis request to the SEDT API fails, users can find information about the failure at either the “Upload User Files” or the “Get Output Data Status” endpoint.

  1. “Upload User Files” endpoint: An error is indicated by a non-201 HTTP status code in the API response.
  2. “Get Output Data Status” endpoint: An error is indicated by a non-200 HTTP status code in the API response OR a value of True in the error-messages field of the API response.
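
The two checks above can be sketched in Python as follows. This is a minimal sketch, not an official client: the function names are hypothetical, and it assumes the caller has already parsed the JSON body and extracted the updates field that holds error-messages (see Chapter 10 for the response format).

```python
from typing import Any, Mapping, Optional


def upload_failed(status_code: int) -> bool:
    """Upload User Files: any status code other than 201 signals an error."""
    return status_code != 201


def status_failed(status_code: int, updates: Optional[Mapping[str, Any]]) -> bool:
    """Get Output Data Status: a non-200 status code, or error-messages set to True."""
    if status_code != 200:
        return True
    # Assumption: `updates` is the already-extracted "updates" field of the response.
    return bool(updates) and updates.get("error-messages") is True
```

For example, `status_failed(200, {"error-messages": True})` reports a failure even though the HTTP status code is 200.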

We’ll discuss the common error and warning messages users may receive from each endpoint below.

Upload User Files Endpoint

The Upload User Files API endpoint returns error information via the status_code and error_message fields in the API response. Common potential errors include the following:

| Error Message | Error Description | HTTP Status Code |
| --- | --- | --- |
| “Only CSV and TSV files are allowed!” | Raised when the resource or supplemental demographic/geographic file is not a CSV or TSV file | 400 |
| “413 Request Entity Too Large” | Raised when the resource or supplemental demographic/geographic file is larger than 200 MB. The API returns the following response: "<html>\r\n<head><title>413 Request Entity Too Large</title></head>\r\n<body>\r\n<center><h1>413 Request Entity Too Large</h1></center>\r\n<hr><center>cloudflare</center>\r\n</body>\r\n</html>\r\n" | 413 |
| “No appropriate CSV encoding. CSV can’t be read in.” or “Invalid encoding.” | Raised when the resource or supplemental demographic/geographic file is not encoded as “utf-8” or “ISO-8859-1” | 400 |
| “[column] is expected to be a column in [columns]” | Raised when the specified lat, lon, or weight column is missing from the resource file, or when the specified geoid, estimate, or margin of error columns are missing from the supplemental demographic/geographic file | 400 |
| “Columns cannot hold more than 10 items” | Raised when the baseline_columns or resource_columns field in the Upload User Files request has more than 10 columns specified | 400 |
| “Column cannot be empty” | Raised when the baseline_columns or resource_columns field in the Upload User Files request is False | 400 |
| “[column] must be a string” | Raised when a key or value in the baseline_columns or resource_columns field is not a string or NoneType | 400 |
| “[geo_value] is not in ‘national’, ‘state’, ‘county’, ‘city’” | Raised when the geography value is not one of “national”, “state”, “county”, “city” | 400 |
| “ACS Data Year should start with ‘20’, and the length should be four digits.” | Raised when an invalid value is provided for the acs_data_year field in the Upload User Files request | 400 |
| “[Demographic/Baseline] Geo ID Column cannot be blank” | Raised when the geoid column field is blank for an included supplemental demographic or geographic file | 400 |
| Invalid file_id | Raised when a user submits a request to the Get Output Data Status or Get Output Data endpoint with a file_id that does not exist. There is no specific error message associated with this error, but the http_status code will be 500 and the results$file_exists key will have the value FALSE. | 500 |
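
Several of the 400 and 413 errors above can be caught on the client before a file is uploaded. The following is a minimal sketch of such a pre-flight check, not part of the SEDT API itself. The function name and its parameters are hypothetical; it assumes that file type can be judged from the file extension, that "200 MB" means 200 × 1024 × 1024 bytes, and that resource_columns is passed as a dictionary.

```python
import os
from typing import Dict, List, Optional

MAX_BYTES = 200 * 1024 * 1024  # assumption: "200 MB" interpreted as 200 MiB
ALLOWED_EXTENSIONS = {".csv", ".tsv"}
ALLOWED_GEOGRAPHIES = {"national", "state", "county", "city"}
MAX_COLUMNS = 10


def preflight_problems(
    filename: str,
    size_bytes: int,
    geography: str,
    acs_data_year: str,
    resource_columns: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return a list of likely upload problems; an empty list means none were found."""
    problems = []
    # Extension-based check only; the server may inspect the file differently.
    if os.path.splitext(filename)[1].lower() not in ALLOWED_EXTENSIONS:
        problems.append("file is not a CSV or TSV")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 200 MB")
    if geography not in ALLOWED_GEOGRAPHIES:
        problems.append("geography is not national, state, county, or city")
    year = str(acs_data_year)
    if not (len(year) == 4 and year.startswith("20") and year.isdigit()):
        problems.append("acs_data_year is not a four-digit year starting with 20")
    if resource_columns is not None and len(resource_columns) > MAX_COLUMNS:
        problems.append("more than 10 columns specified")
    return problems
```

Passing these checks does not guarantee that the upload will be accepted: the sketch does not attempt to validate file encoding or the presence of the named columns, which the API also checks.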

Get Output Data Status Endpoint

The “Get Output Data Status” API endpoint returns information on a set of potential error and warning messages for each analysis submitted to the API. We provide more information on these errors and warnings below. For more information on the endpoint and response format, see Chapter 10.

Error Messages

These messages are provided when an error occurs that causes the analysis to fail. If any error message is returned in the “Get Output Data Status” response, the error-messages field within the updates field of the response will be set to True.

| Error Message | Error Description |
| --- | --- |
| form-data-parameter-validation-failed | Indicates that the Upload User Files request body is misformatted. Please see Chapter 10 for more information. |
| data_readin_error | Indicates that one of the resource or supplemental geographic or demographic datasets provided by the user cannot be read in for analysis. One reason this may occur is that the file does not have an appropriate CSV encoding. |
| df_conversion_to_gdf_failed | Indicates that the resource dataset could not be converted to a GeoPandas GeoDataFrame during the analysis. |
| weight_coltypes_mismatch | Indicates that the weight column provided by the user is not a numeric column and cannot be converted to numeric. |
| pts_not_in_any_geography | Indicates that the points in the resource dataset do not fall within any applicable geography for the analysis. The tool will also try switching the latitude and longitude column designations to see if that resolves the issue before throwing this error. For more information, see Chapter 5. |
| sjoin_failed | Indicates that the spatial join between the points in the user-provided resource dataset and the census tract polygons in the geography identified by the tool has failed. |
| unable_to_generate_presigned_urls | Indicates that the process of generating the presigned URLs for downloading the geographic and demographic analysis output has failed. |
| dem/geo_cols_not_in_data | Indicates that supplemental demographic or geographic data columns provided in the API request are not in the corresponding dataset. Column names are case sensitive. |
| dem/geo_colls_cannot_be_converted_numeric | Indicates that at least one of the supplemental demographic or geographic columns provided in the API request cannot be converted to a numeric column. All estimate and margin of error columns must be numeric or convertible to numeric types. |
| invalid_column_in_data | Indicates that one of the user-provided datasets has a column called “GEOID_urbaninstitute”, which is a reserved column name for the system. This issue can be resolved by renaming the column prior to upload. |
| latlon_cols_not_in_data | Indicates that the specified latitude or longitude column is not present in the resource dataset. |
| lat_lon_cannot_be_converted_numeric | Indicates that the latitude and/or longitude columns provided by the user are not numeric columns and cannot be converted to numeric. |
| all_rows_filtered | Indicates that the filter conditions specified by the user caused all rows of the resource dataset to be dropped. Only occurs with the GUI tool. |
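
Once a failure has been detected, a client may want to turn the raised error names into something actionable. The sketch below shows one way to do that. It rests on an assumption this chapter does not confirm: that the individual errors are available as a mapping from the error names above to boolean values (called error_flags here, a hypothetical name). The hints are paraphrased from the table above and cover only a few of the errors.

```python
from typing import Any, List, Mapping

# Short remediation hints paraphrased from the error table (not exhaustive).
HINTS = {
    "data_readin_error": "check that each file has a supported CSV encoding",
    "weight_coltypes_mismatch": "make sure the weight column is numeric",
    "pts_not_in_any_geography": "check the latitude/longitude columns and the chosen geography",
    "latlon_cols_not_in_data": "check the latitude and longitude column names",
    "lat_lon_cannot_be_converted_numeric": "make sure latitude and longitude are numeric",
    "invalid_column_in_data": "rename the GEOID_urbaninstitute column before upload",
}


def raised_errors(error_flags: Mapping[str, Any]) -> List[str]:
    """Return one 'name: hint' line for every error flag that is set to True."""
    lines = []
    for name, value in error_flags.items():
        if value is True:
            hint = HINTS.get(name, "see the error table for details")
            lines.append(f"{name}: {hint}")
    return lines
```

Errors without an entry in HINTS fall back to a generic pointer, so the function stays usable if additional error names appear in the response.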

Warning Messages

These messages are provided to alert the user to behavior that may be indicative of a concern but does not cause the analysis to fail.

| Warning Message | Warning Description |
| --- | --- |
| multiple_geographies_flag | Indicates whether the resource data file uploaded by the user has points that span multiple geographies. For example, if a user selected a state-level analysis and uploaded a resource file with points in multiple states, this would be True. In such a case, the tool uses the geography with the most points. |
| num_null_latlon_rows_dropped | Indicates the number of rows in the resource dataset dropped because they have null values in the latitude or longitude columns provided by the user. If no rows are dropped, this will be None. |
| num_null_weight_rows_dropped | Indicates the number of rows in the resource dataset dropped because they have null values in the weight column provided by the user. If no rows are dropped or if no weight column is specified, this will be None. |
| num_out_of_geography_rows_dropped | Indicates the number of rows in the resource dataset dropped because they fall outside of the geography of analysis. This typically occurs when the resource dataset spans multiple geographies (see multiple_geographies_flag for more information). If no rows are dropped, this will be None. |
| multiple_geographies_list | If multiple_geographies_flag is True, this is a list of the geographies that points in the resource data file fall within. Otherwise, this is None. |
| few_sub_geos_flag | Indicates that the user-provided resource dataset has points in fewer than 50% of the sub-geographies: states for a national analysis, counties for a state analysis, or tracts for a city or county analysis. |
| dem/geo_dropped_over_half_values_greater_than_total_pop | Provides the names of the supplemental demographic and geographic dataset columns dropped from the analysis because more than half of the values are greater than the total population for the given subgeography. Note that this only applies to estimate columns, as margin of error values can validly be larger than the total population. |
| dem/geo_values_greater_than_total_pop | Provides the names of the supplemental demographic and geographic dataset columns that contain any values that are greater than the total population for the given subgeography. Note that this only applies to estimate columns, as margin of error values can validly be larger than the total population. |
| dem/geo_dropped_over_half_values_negative | Provides the names of the supplemental demographic and geographic dataset columns dropped from the analysis because more than half of the values are negative. |
| dem/geo_dropped_any_values_negative_margin | Provides the names of the supplemental demographic and geographic dataset margin of error columns dropped from the analysis because any of the values are negative. If the margin is dropped but the corresponding estimate column is kept, the disparity scores for that column will not be statistically significant. |
| dem/geo_values_negative | Provides the names of the supplemental demographic and geographic dataset columns that have any negative values. |
| dem/geo_float_values | Provides the names of the supplemental demographic and geographic dataset columns that have any float values. |
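
Because many warning fields are None or False when nothing noteworthy happened, it can be useful to filter the warnings down to those that carry information. The following is a minimal sketch, assuming the warnings have already been extracted from the response into a dictionary keyed by the warning names above; the function name is hypothetical, and treating empty lists as inactive is an assumption.

```python
from typing import Any, Dict, Mapping


def active_warnings(warnings: Mapping[str, Any]) -> Dict[str, Any]:
    """Keep only warnings that carry information: drop None, False, and empty values."""
    active = {}
    for name, value in warnings.items():
        if value is None or value is False:
            continue  # warning not triggered
        if isinstance(value, (list, tuple, str)) and len(value) == 0:
            continue  # assumption: an empty list of column names means no warning
        active[name] = value
    return active
```

A client could log the result of `active_warnings(...)` alongside the analysis output so that dropped rows or dropped columns are not overlooked when interpreting the results.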