Data extracts – editing, coding and weighting

Data extracts can be viewed and edited from the Data extracts page. Each data extract is listed by date, and the most recently created extract is the default used for reporting. Reports can also link to a specific data extract rather than the latest one.

 

View Data Extract brings up the editor view, showing the data table with column names and a row per respondent record.

 

From File options… in the top right, you can download the data extract as a .csv file.

 

Data edits

 

Data editing is the process of marking up changes to the data file and saving them in an ‘edits’ file, from which they can be applied to future data extracts.

 

The benefit of an edits file is that the list of edits can be applied to any future data extract at the click of a button. The edits file itself can also be saved and extended with new edits.

 

It also means you always have a record of what has been done to the data file. This can be a little confusing at first, but it provides a full audit trail for the data, instead of tweaking individual data points by hand and then losing track of what is original and what has been adjusted.

 

Each cell in the Data Extract Editor can be marked with an edit by clicking on the cell contents. This allows you to type in an ‘edit’ that can be saved and then applied when creating a new data extract. Once you have all the edits, click Save to Server to save the edits to an edit file.

 

For complex projects, you may have several edit files, and can pool together edit files from multiple people working on the data.

 

You can bring in a pre-existing edit file using Load external edit file, applying edit files as necessary, with the option to save the combined set of edits using Save to Server.

 

Because cell-by-cell editing is not very efficient, you can instead create a list of edits directly in a spreadsheet: download the Data Extract as a .csv file, then create the edits in the spreadsheet rather than in the data editor.

 

The edits can be saved as a .tsv file with headings corresponding to the columns that change and a row per respondent, or as a three-column .tsv list of respondent, column and value.

Edit files in this form can also be used to add weighting (for the weight column) and open-ended coding.
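
As an illustration of the two edit file layouts, the sketch below converts a wide table (one row per respondent, one heading per changed column) into the three-column list of respondent, column and value. It is a minimal sketch, not the platform's own tooling: the function name wide_to_long is hypothetical, the column names age and weight are example data, and it assumes that a blank cell means "no edit" and that the three-column list has no heading row.

```python
import csv
import io


def wide_to_long(wide_tsv: str, id_column: str = "resp_id") -> str:
    """Convert a wide edit table into respondent/column/value rows (tab-separated)."""
    reader = csv.DictReader(io.StringIO(wide_tsv), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        for column, value in row.items():
            if column == id_column:
                continue
            # Assumption: an empty cell means "no edit for this cell", so skip it.
            if value is None or value == "":
                continue
            writer.writerow([row[id_column], column, value])
    return out.getvalue()


# Example wide edit table: respondent 102 only has a weight edit.
wide = "resp_id\tage\tweight\n101\t34\t1.2\n102\t\t0.8\n"
long_edits = wide_to_long(wide)
print(long_edits)
```

Either layout carries the same edits; the three-column list is usually easier to merge when several people contribute edits to the same project.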

 

Coding of open-ends

 

In the data extract, each open-ended question automatically generates a column for coding, named with the prefix oe_. For instance, a question called ‘views’ gets the column oe_views.

 

The idea is to create a set of edits that supply the values for this oe_ column.

 

For open-ended coding, the column should contain a comma-separated list of codes (e.g. 3,6,11) per respondent, or be left blank if no codes apply. The codes you choose should correspond to a coding frame created at the start, when you review all the open-ended comments.

 

Code the open-ends in a spreadsheet, then save the coding as a .tsv edit file with resp_id, column name and code list (use the tab-separated .tsv format, not .csv).

 

To include this in a report, add an ‘Open Ended Coding’ question to the report with a name corresponding to the oe_ column. This reads the comma-separated list and turns the data into a chart or table.
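
To check a coding file before loading it, you can tally the code lists yourself. The sketch below shows one way to count how many respondents received each code from a column of comma-separated lists. It does not describe how the report itself is computed: the function tally_codes and the example code frame (Price, Service, Delivery) are hypothetical.

```python
from collections import Counter


def tally_codes(codelists, code_frame):
    """Count respondents per code from comma-separated code lists."""
    counts = Counter()
    for codelist in codelists:
        # A set, so a code typed twice for one respondent is counted once.
        codes = {c.strip() for c in codelist.split(",") if c.strip()}
        counts.update(codes)
    # Most frequent first; ties broken by code for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(code_frame.get(code, f"code {code}"), n) for code, n in ordered]


# Hypothetical coding frame and oe_views values for four respondents.
code_frame = {"3": "Price", "6": "Service", "11": "Delivery"}
oe_views = ["3,6,11", "", "6", "3, 6"]
summary = tally_codes(oe_views, code_frame)
print(summary)
```

Blank entries contribute nothing to the counts, which matches leaving the column blank when no codes apply.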

 

We recommend doing the coding in a spreadsheet (use Autofilter and sort the open-ends, as sorting typically groups similar answers together), or using an automated AI-based coding app to create the code frame and generate the coding in a form that can be imported as an edit file.

 

Once the open-ended edits are created they can be applied to any new data extract with ease.

 

Weighting

 

Data extracts contain a weight column, which is set to 1 by default. If you want to weight the data to adjust for imbalances in the sample, you can create an edit file with respondent, column (weight) and value, and apply this edit file to the data extract.

 

The process of weighting is beyond the scope of this manual, but you will need to know the population statistics and compare them with the statistics of the sample you have achieved.

 

A basic weight is often calculated as weight = population statistic / sample statistic, for example a group's share of the population divided by its share of the sample.
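
As a worked example of this basic calculation, the sketch below computes one weight per group from population shares and achieved sample counts. The function name cell_weights and the figures are illustrative assumptions, not values from any real study.

```python
def cell_weights(population_share, sample_counts):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / total
        weights[group] = population_share[group] / sample_share
    return weights


# Illustrative figures: the population is 50/50, the sample is 40/60.
population_share = {"male": 0.5, "female": 0.5}
sample_counts = {"male": 40, "female": 60}
weights = cell_weights(population_share, sample_counts)
print(weights)
```

Here the under-represented group gets a weight above 1 (0.5 / 0.4 = 1.25) and the over-represented group a weight below 1, so the weighted sample totals still add up to the original 100 respondents.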

 

More complex methods include interlocking weighting and rim weighting, where each individual in the sample can get a different weight.
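
For readers curious about rim weighting, the sketch below shows the usual iterative approach (often called raking): weights are adjusted to match the target shares for one variable at a time, and the passes are repeated until all variables match. This is a minimal sketch under stated assumptions, not a production routine: the function names, the fixed iteration count and the example respondents and targets are all hypothetical, and it assumes every target category appears in the sample.

```python
def rim_weights(respondents, targets, iterations=100):
    """Rake weights so weighted shares match the targets for each variable.

    respondents: list of dicts, e.g. {"gender": "male", "age": "young"}
    targets: {variable: {category: population share}}
    """
    weights = [1.0] * len(respondents)
    for _ in range(iterations):
        for variable, shares in targets.items():
            # Current weighted total for each category of this variable.
            totals = {}
            for w, r in zip(weights, respondents):
                totals[r[variable]] = totals.get(r[variable], 0.0) + w
            grand = sum(totals.values())
            # Rescale so this variable's weighted shares hit the targets.
            weights = [
                w * (shares[r[variable]] * grand) / totals[r[variable]]
                for w, r in zip(weights, respondents)
            ]
    return weights


def weighted_share(respondents, weights, variable, category):
    """Weighted share of respondents falling in one category."""
    hit = sum(w for w, r in zip(weights, respondents) if r[variable] == category)
    return hit / sum(weights)


# Hypothetical sample of 10 respondents and hypothetical population targets.
respondents = (
    [{"gender": "male", "age": "young"}] * 3
    + [{"gender": "male", "age": "old"}] * 1
    + [{"gender": "female", "age": "young"}] * 2
    + [{"gender": "female", "age": "old"}] * 4
)
targets = {
    "gender": {"male": 0.5, "female": 0.5},
    "age": {"young": 0.4, "old": 0.6},
}
weights = rim_weights(respondents, targets)
print(weighted_share(respondents, weights, "gender", "male"))
print(weighted_share(respondents, weights, "age", "old"))
```

Respondents in the same combination of categories end up with the same weight, and the weights still sum to the number of respondents.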

 

We recommend using a spreadsheet to do the weighting calculations, then generating an edit file for the Data Editor to update the data extract with the weights you have created.
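
If you prefer a script to a spreadsheet for the final step, the sketch below writes calculated weights out as three-column edit rows of respondent, column and value. It is an illustration only: the function weight_edit_rows, the respondent IDs and the rounding to four decimal places are assumptions, and it assumes the edit list has no heading row.

```python
import csv
import io


def weight_edit_rows(respondent_groups, group_weights, column="weight"):
    """Build tab-separated respondent/column/value rows for the weight column."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for resp_id, group in respondent_groups.items():
        # Assumption: four decimal places is enough precision for a weight.
        writer.writerow([resp_id, column, f"{group_weights[group]:.4f}"])
    return out.getvalue()


# Hypothetical respondents, each assigned to a weighting group.
respondent_groups = {"101": "male", "102": "female", "103": "female"}
group_weights = {"male": 1.25, "female": 0.8333}
edit_file_text = weight_edit_rows(respondent_groups, group_weights)
print(edit_file_text)
```

Save the resulting text with a .tsv extension before loading it as an external edit file.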

