Collecting Data

What format is participant data?

Participant data is provided as a CSV file in long format, meaning each row represents a single data point from a participant's response. For example, if your experiment involves mouse tracking, each recorded mouse position will be listed on a separate row. This structure is particularly useful for regression analysis and other statistical methods.

For studies that collect non-text data, such as audio or images, the downloaded data will be packaged as a ZIP archive. This archive contains a master CSV file alongside the corresponding multimedia files. The CSV file includes filenames that link each participant response to its associated file, ensuring all data remains organized and easy to analyze.
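As a rough illustration of how the master CSV can be matched to the media files in the archive, here is a minimal Python sketch. It rests on several assumptions that are not confirmed above: that the archive holds exactly one CSV file, that the filenames appear in the response_value column, and that each filename is unique within the archive. The function name index_media_files is hypothetical; check your own download and adjust accordingly.

```python
import csv
import io
import zipfile


def index_media_files(archive, filename_column="response_value"):
    """Return (participant_id, archive_member) pairs for rows that name a media file.

    `filename_column` is an assumption; check which column in your own
    export actually holds the media filenames.
    """
    with zipfile.ZipFile(archive) as zf:
        # Ignore directory entries, which end with a slash.
        members = [name for name in zf.namelist() if not name.endswith("/")]
        # Assumption: the archive contains exactly one CSV (the master file).
        csv_name = next(name for name in members if name.lower().endswith(".csv"))
        # Index members by base name so rows may reference bare filenames.
        by_basename = {name.rsplit("/", 1)[-1]: name for name in members}
        linked = []
        with zf.open(csv_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            for row in csv.DictReader(text):
                member = by_basename.get(row.get(filename_column) or "")
                if member is not None and member != csv_name:
                    linked.append((row["participant_id"], member))
    return linked
```

Calling index_media_files("my_download.zip") (a placeholder path) would return one pair per response that points at a file in the archive. Rows whose response is ordinary text are skipped.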

What variables are included in the data file?

The data file you download from FindingFive contains a variety of variables that help you analyze participant responses. Each row represents a single response, so each participant appears on multiple rows. The data is also randomized by participant to protect privacy.

Key variables included in the file are:

  • Study and Session Identifiers: Each study is assigned an expt_id, while individual sessions are identified by a session_id. These help you track different study runs.
  • Participant Information: The participant_id uniquely identifies each participant, and participation_duration records the total time they took to complete the study.
  • Trial Information: The trial_num column shows the trial number, while trial_template indicates which trial template was used. If you've assigned participant groups, group_id records their assigned group.
  • Response Data: The response_value column contains the participant's response, while response_type specifies the type of response (e.g., choice, text, rating). If the response had a correct answer, response_target lists the correct response, and response_correct indicates whether the participant answered correctly. Reaction times are captured in response_rt.
  • Stimuli Information: The stimuli_presented column lists the stimuli associated with the response.
  • Timing Variables: The trial_duration column records the time between stimulus onset and the start of the next trial.

If your study includes conditional branching, additional columns will appear to indicate how participants were assigned to different branching groups. Some fields, such as network_error_repeat, are primarily for debugging and can be ignored unless you encounter unexpected behavior.
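To show how these columns fit together in a long-format file, the sketch below computes each participant's accuracy and mean reaction time using only Python's standard library. The column names come from the list above. The value encodings are assumptions: response_correct is taken to be text such as "True"/"False" or "1"/"0", and response_rt is taken to be numeric or empty. The function name summarize_by_participant is hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize_by_participant(csv_file):
    """Return {participant_id: (accuracy, mean_rt)} from a long-format export."""
    correct = defaultdict(list)
    rts = defaultdict(list)
    for row in csv.DictReader(csv_file):
        pid = row["participant_id"]
        # Assumption: correctness is stored as "True"/"False" or "1"/"0".
        flag = (row.get("response_correct") or "").strip().lower()
        if flag in {"true", "1"}:
            correct[pid].append(1)
        elif flag in {"false", "0"}:
            correct[pid].append(0)
        # Rows with an empty reaction-time cell are left out of the mean.
        rt = (row.get("response_rt") or "").strip()
        if rt:
            rts[pid].append(float(rt))
    summary = {}
    for pid in sorted(set(correct) | set(rts)):
        acc = sum(correct[pid]) / len(correct[pid]) if correct[pid] else None
        mean_rt = sum(rts[pid]) / len(rts[pid]) if rts[pid] else None
        summary[pid] = (acc, mean_rt)
    return summary


# Tiny made-up sample, only to show the expected shape of the input.
sample = io.StringIO(
    "participant_id,response_correct,response_rt\n"
    "p1,True,500\n"
    "p1,False,700\n"
    "p2,,300\n"
)
print(summarize_by_participant(sample))
# {'p1': (0.5, 600.0), 'p2': (None, 300.0)}
```

With a real download, pass an open file instead of the sample, for example open("data.csv", newline="", encoding="utf-8"), where data.csv is a placeholder name.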

My experiment collects speech production data. What is the format of the audio recordings?

Audio recordings are saved in the open Ogg format, which most audio players support and which can be converted to other formats with open-source tools such as fre:ac.

Audio is recorded at a sampling rate of either 44,100 Hz or 48,000 Hz, depending on the participant's recording hardware, which ensures high-fidelity audio. To maximize compatibility, recordings are stored in mono: if a participant's microphone provides stereo input, the channels are mixed down into a single mono track.
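If you prefer to script the conversion instead of using a graphical tool, one option is the ffmpeg command-line program. This is an alternative to the tool named above, not something FindingFive prescribes. The sketch assumes ffmpeg is installed and on your PATH; the helper names build_convert_command and convert are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(ogg_path, target_ext=".wav"):
    """Build an ffmpeg command that converts one Ogg recording to another format."""
    src = Path(ogg_path)
    dst = src.with_suffix(target_ext)
    # "-i" names the input file; ffmpeg infers the output format from the extension.
    return ["ffmpeg", "-i", str(src), str(dst)]


def convert(ogg_path, target_ext=".wav"):
    """Run the conversion, or raise a clear error if ffmpeg is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_convert_command(ogg_path, target_ext), check=True)
```

For example, convert("rec1.ogg") would write rec1.wav next to the original file, where rec1.ogg is a placeholder filename.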

Does FindingFive provide any participant information beyond the data collected in my experiment?

Yes, but only if you choose to receive it. When launching a session, you can configure whether you want access to participant contact information, including names and email addresses. There are three options: receive contact information as soon as participants start the experiment, receive it only after they complete the experiment, or receive no contact information at all, which makes the session fully anonymous.

These options serve different research needs. If you're running studies with your own students, collecting contact information at the start can help with administrative tasks like course credit assignments. If you're recruiting from our in-house participant pool, keeping the session anonymous may make it more appealing to privacy-conscious participants.

Regardless of whether you choose to receive contact information, FindingFive still tracks participation history internally. This means that features like setting prerequisites between experiments will continue to function as expected.

Why can't I get my data while a session is running?

This may be our most frequently asked question! The main reason is privacy protection.

In most session types, FindingFive provides researchers with the contact information of participants who complete their study. This is especially useful when researchers are tracking participation among their own students. However, if participant data could be downloaded while a session is still active, researchers could compare the newly available data with the list of completed participants in real time, making it possible to infer which responses belong to which individuals. This would effectively link participant identities to their data, a practice explicitly prohibited by most IRBs and one that participants expect to be avoided.

To safeguard participant privacy and prevent any unintended identification, FindingFive only allows data to be downloaded once a session is complete or canceled.

What identifiable personal information does FindingFive remove from participant data?

Once a study session is finished, researchers can download participant data. Each participant is identified by a special ID that is non-reversibly derived from their FindingFive ID and the researcher's ID. This means that a participant will have the same ID across all studies run by the same researcher, but their ID will be different across researchers to protect privacy. These generated IDs do not exist in the FindingFive database, ensuring that participant identities remain separate from their response data.
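To make the idea of a non-reversibly derived ID concrete, here is a minimal Python sketch of one way such an ID could be built. It is not FindingFive's actual algorithm, which is not described here. The choice of HMAC-SHA256, the secret key, and the function name derive_participant_id are all assumptions made for illustration.

```python
import hashlib
import hmac


def derive_participant_id(participant_id, researcher_id, secret_key):
    """Derive a stable, one-way pseudonym for a (participant, researcher) pair.

    Illustration only: the hashing scheme and the secret key are assumptions,
    not FindingFive's documented method.
    """
    message = f"{researcher_id}:{participant_id}".encode("utf-8")
    # HMAC-SHA256 is a keyed one-way function: without the key, the digest
    # cannot feasibly be traced back to the original IDs.
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


key = b"example-secret"  # hypothetical key, for demonstration only
same_researcher_a = derive_participant_id("participant-1", "researcher-A", key)
same_researcher_b = derive_participant_id("participant-1", "researcher-A", key)
other_researcher = derive_participant_id("participant-1", "researcher-B", key)

print(same_researcher_a == same_researcher_b)  # True: stable within one researcher
print(same_researcher_a == other_researcher)   # False: differs across researchers
```

The two properties shown at the end mirror the behavior described above: the same participant keeps one ID across a researcher's studies, while a different researcher sees a different ID for that participant.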

What if I must retain the ability to link participant identity to their data?

If your IRB has explicitly approved the collection of personally identifiable information, you can gather this data as part of a response in your study. However, to protect participant privacy, the study must be set up as an encrypted study, ensuring that all collected data remains encrypted in storage.

Researchers who collect personally identifiable information are fully responsible for safeguarding participant data after downloading it. It is crucial to follow best practices in data security and comply with any IRB or institutional guidelines regarding data protection.