User Guide

1. Getting Started

1.1. Accessing your Account

  1. Log in to the Texas Data Repository.
    • Click “Log in” in the navigation on the top right.
  2. If you are from a participating Texas Data Repository institution:
    • Select your institution from the drop-down menu of participating member institutions.
    • Log in using your institutional credentials.
  3. If you are NOT from a participating Texas Data Repository institution:
    • Enter your username/email and password in the “Dataverse Account” section of the login page.


If it is your first time logging into the system, you will be asked to review and agree to the Texas Data Repository Terms of Use.

1.2. About Dataverse, Metadata, and End-Users

A dataverse is a container for datasets (research data, code, documentation, and metadata) and other dataverses, which can be set up for individual researchers, departments, journals, and organizations.

Once a user creates a dataverse, they become, by default, the administrator of that dataverse. The dataverse administrator has access to manage the settings described in this guide. As the administrator of any dataverses you create, you can choose to upload your datasets to them. You can also link any existing dataverse or dataset within the TDL Data Repository to your dataverse, where it will appear in the table of contents among any datasets you have uploaded yourself.

A researcher may choose to create a dataverse to contain the various datasets associated with a particular grant, research collaboration, or project, or may wish to collect all data associated with them under a single personal dataverse. One may also upload datasets to the TDL Data Repository without first creating a dataverse to contain them.

Each Texas Digital Library member institution has an institutional dataverse intended to capture all dataverses and datasets created by that institution’s researchers. An institutional repository librarian at each member institution will curate the institutional dataverse and will take responsibility for linking each dataset and dataverse created by institutional researchers to the institutional dataverse.

schematic diagram of a dataverse in dataverse 4.0

2. Creating a Dataverse

2.1. Creating a Dataverse (i.e., collection)

1. Once you are logged in, click on the “Add Data” button and in the drop-down menu select “New Dataverse”.

screenshot

2. Once on the “New Dataverse” page, fill in the following fields:

  • Name: Enter the name of your dataverse
  • Identifier: This is an abbreviation, usually lowercase, that becomes part of the URL for the new dataverse. Special characters (~, ’, !, @, #, $, %, ^, &, and *) and spaces are not allowed. Note: if you change this field, the URL for your dataverse changes (http://…/’url’), which affects links to this page.
  • Email: This is the email address that will be used as the contact for this particular dataverse. You can have more than one contact email address for your dataverse.

screenshot

  • Affiliation: Add any affiliation that can be associated with this particular dataverse (e.g., project name, institute name, department name, journal name, etc.). This is automatically filled out if you have added an affiliation for your user account.
  • Description: Provide a description of this dataverse. This will display on the homepage of your dataverse and in the search result list. The description field supports certain HTML tags (<a>, <b>, <blockquote>, <br>, <code>, <del>, <dd>, <dl>, <dt>, <em>, <hr>, <h1>, <h3>, <i>, <img>, <kbd>, <li>, <ol>, <p>, <pre>, <s>, <sup>, <sub>, <strong>, <strike>, <ul>).

screenshot

  • Category: Select a category that best describes the type of dataverse this will be. For example, if this is a dataverse for an individual researcher’s datasets, select Researcher. If this is a dataverse for an institution, select Organization & Institution.
  • Choose the sets of Metadata Elements for datasets in this dataverse: by default the metadata elements will be from the host dataverse that this new dataverse is created in. Dataverse offers metadata standards for multiple domains.

screenshot

  • Select facets for this dataverse: by default the facets that will appear on your dataverse landing page will be from the host dataverse that this new dataverse is created in. The facets are simply metadata fields that can be used to help others easily find dataverses and datasets within this dataverse. You can select as many facets as you would like.

screenshot

3. Selected metadata elements are also used to pick which metadata fields you would like to use for creating templates for your datasets. Metadata fields can be hidden, or selected as required or optional. Once you have selected all the fields you would like to use, you can create your template(s) after you finish creating your dataverse.

4. Click the “Create Dataverse” button and you’re done!

screenshot

5. Once your dataverse is ready to go public, go to your dataverse page and click the “Publish” button on the right-hand side of the page. A pop-up will ask you to confirm that you are ready to publish, since once a dataverse is made public, it can no longer be unpublished. Note: a dataverse must be published before a dataset within that dataverse can be published.

screenshot

*Required fields are denoted by a red asterisk.
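The Identifier and Description rules in step 2 can be checked before you fill in the form. The following is a minimal sketch, not part of the repository itself: the function names are hypothetical, the forbidden characters and the tag set are copied from the lists above, and treating both the straight and the curly apostrophe as forbidden is an assumption.

```python
import re
from html.parser import HTMLParser
from typing import List, Set

# Characters the guide lists as not allowed in an Identifier, plus whitespace.
# Assumption: both the straight (') and curly (’) apostrophe are rejected.
FORBIDDEN_IDENTIFIER_CHARS = re.compile(r"[~'’!@#$%^&*\s]")

# Tags copied verbatim from the Description field's list of supported HTML tags.
ALLOWED_DESCRIPTION_TAGS = {
    "a", "b", "blockquote", "br", "code", "del", "dd", "dl", "dt", "em",
    "hr", "h1", "h3", "i", "img", "kbd", "li", "ol", "p", "pre", "s",
    "sup", "sub", "strong", "strike", "ul",
}


def identifier_problems(identifier: str) -> List[str]:
    """Return the distinct forbidden characters found in a proposed identifier."""
    return sorted(set(FORBIDDEN_IDENTIFIER_CHARS.findall(identifier)))


class _TagCollector(HTMLParser):
    """Collects the names of all start tags seen in a fragment of HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        self.tags.add(tag)


def unsupported_tags(description_html: str) -> List[str]:
    """Return the tags in a description that are not in the supported list."""
    collector = _TagCollector()
    collector.feed(description_html)
    collector.close()
    return sorted(collector.tags - ALLOWED_DESCRIPTION_TAGS)
```

An empty list from either function means no problem was found by this sketch; the form itself remains the final authority on what is accepted.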

2.2. Preparing Data, Code, and Additional Documentation

Dataset files must be 2GB or smaller for direct upload. For larger files, contact the TDL Helpdesk: support@tdl.org
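Before uploading, it can help to scan a folder for files that exceed the direct-upload limit. The sketch below is illustrative only: `oversized_files` is a hypothetical helper, and it assumes "2GB" means 2 × 1024³ bytes, which the guide does not specify.

```python
from pathlib import Path
from typing import List, Tuple

# Assumption: "2GB" is read as 2 * 1024**3 bytes; the guide does not say
# whether it means binary or decimal gigabytes.
MAX_UPLOAD_BYTES = 2 * 1024**3


def oversized_files(folder: str, limit: int = MAX_UPLOAD_BYTES) -> List[Tuple[str, int]]:
    """Return (relative path, size in bytes) for every file larger than limit."""
    root = Path(folder)
    found = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            # Files exactly at the limit are allowed ("2GB or smaller").
            if size > limit:
                found.append((path.relative_to(root).as_posix(), size))
    return found
```

Any file this reports would need to go through the TDL Helpdesk rather than direct upload.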

If your data requires special instructions, disclaimers, field definitions, etc., please prepare a README.txt file to accompany your data files. (See Georgia Tech’s README template)

2.3. Creating Metadata

During the upload process, Dataverse will require that certain metadata fields describing your dataset be completed. It is recommended to prepare metadata prior to upload in order to resolve any questions about how to populate the required and optional fields. Important: metadata is the primary means by which your data become discoverable and usable to end users. It provides descriptive terms for search engines to match to end-user search terms, and it gives downstream users exploring your data for the first time valuable context about how, where, when, and for what purpose your data were created. Making data discoverable and usable to end users increases data citations (Gleditsch, Metelits & Strand, 2003; Piwowar, Day & Fridsma, 2007; Ioannidis et al., 2009; Pienta, Alter & Lyle, 2010; Henneken & Accomazzi, 2011; Sears, 2011; Dorch, 2012; Piwowar & Vision, 2013).

Below are the required Metadata fields for Dataset upload (hover over field names on the form to display more information):

  • Title: Full title by which the Dataset is known.
  • Name (Author): The author’s Family Name, Given Name, or the name of the organization responsible for this Dataset.
    • Format: Personal name expressed as Last Name, First Name, Middle Initial. Organizational name as it appears. Examples:
      • Obama, Barack H.
      • Texas Digital Library
  • Contact with email: The email address(es) of the contact(s) for the dataset. This will not be displayed to the user.
  • Description (Text): A summary describing the purpose, nature, and scope of the dataset.
  • Date: In cases where a dataset contains more than one description (for example, one might be supplied by the data producer and another prepared by the data repository where the data are deposited), the date attribute is used to distinguish between the two descriptions. Date expressed in ISO format (YYYY-MM-DD). Example: 2016-01-30
  • Subject: Domain-specific Subject Categories that are topically relevant to the dataset.
  • Production Date: Date when the data collection or other materials were produced (not distributed, published, or archived). Date expressed in ISO format (YYYY-MM-DD). Example: 2016-01-30
  • Production Place: The location where the data collection and any other related materials were produced.
  • Kind of Data: Type of data included in the file. Formatted as free text.
    • Examples:
      • survey data
      • census/enumeration data
      • aggregate data
      • clinical data
      • event/transaction data
      • program source code
      • machine-readable text
      • administrative records data
      • experimental data
      • psychological test
      • textual data
      • coded textual
      • coded documents
      • time budget diaries
      • observation data/ratings
      • process-produced data
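If you prepare metadata ahead of time, the format rules in the list above (required fields present, dates as YYYY-MM-DD, personal names as Last Name, First Name, Middle Initial) can be checked with a short script. This is a sketch under stated assumptions: the dictionary keys and function names are hypothetical labels chosen for illustration and are not Dataverse field identifiers.

```python
import re
from datetime import date
from typing import Dict, List

# Hypothetical keys standing in for the required fields listed above.
REQUIRED_FIELDS = [
    "title", "author", "contact_email", "description", "date",
    "subject", "production_date", "production_place", "kind_of_data",
]
DATE_FIELDS = ["date", "production_date"]


def is_iso_date(value: str) -> bool:
    """True if value is a valid calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def format_personal_name(last: str, first: str, middle_initial: str = "") -> str:
    """Format a personal name as 'Last, First M.' as described in the guide."""
    name = f"{last.strip()}, {first.strip()}"
    initial = middle_initial.strip().rstrip(".")
    if initial:
        name += f" {initial[0].upper()}."
    return name


def metadata_problems(record: Dict[str, str]) -> List[str]:
    """List missing required fields, then dates that are not YYYY-MM-DD."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field, "").strip():
            problems.append(f"missing: {field}")
    for field in DATE_FIELDS:
        value = record.get(field, "").strip()
        if value and not is_iso_date(value):
            problems.append(f"not YYYY-MM-DD: {field}")
    return problems
```

Organizational names are entered as they appear, so `format_personal_name` applies only to individual authors.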

Other fields (not required). Please refer to the Texas Data Repository Metadata Guidelines for the definitions of each term:

  • Keywords
  • Notes
  • Depositor
  • Deposit Date
  • Related Publication Metadata:
    • Citation
    • ID Type
    • ID Number
    • URL
    • Notes
    • Language

3. Uploading and Sharing your Data

3.1. Upload a Dataset and Describe it with Metadata

  • Upload and provide information about dataset(s) to your Dataverse. (Alternatively you can upload datasets directly into your institutional Dataverse.)
  • Upload your datasets by clicking Add Data and selecting New Dataset.

screenshot

  • Complete the required citation metadata fields and upload your data files and documentation by selecting + Select Files to Add.

Screenshot highlighting the "Select Files to Add" button on the webpage.

  • Select Save Dataset to add your unpublished data files and documents to your Dataverse.

Screenshot highlighting the "Save Dataset" button on the webpage.

  • Metadata can be edited by clicking Edit Dataset and selecting Metadata.

screenshot

  • Datasets can be edited by selecting a dataset and clicking Edit.

screenshot

  • Additional data files may be added by clicking Upload Files in the files tab.

screenshot

  • You may restrict access to your datasets as well.

3.2. Share an Unpublished Dataset

  • Users and Groups can be granted permissions to access unpublished datasets with varying levels of permissions for viewing, modifying, and contributing to the dataset. For more on roles and permissions, go here.
  • From the dataset page, click Edit => Permissions => Dataset.

screenshot

    • Click “Assign Role for Users/Groups.”
    • Begin entering the username of the user you would like to add to the dataset. As you type, options will appear. (The person you are sharing the dataset with must have a user account in order to be invited to collaborate.)
    • Select the appropriate role for the user, depending on the level of access you wish them to have.

screen shot

3.3. Publish Dataset

  • When the submitter is ready to make a dataset available to the public, the dataset may be published by clicking Publish. Depending on your institution’s policy, you may instead see “Submit to publish” and your dataset will be submitted to the Data Repository Librarian at your institution for publication.

screenshot

  • A pop-up notification will confirm that you are ready to publish the dataset.

screenshot

  • The dataset cannot be unpublished, but it can be edited and, if necessary, deaccessioned.
    • IMPORTANT: Each time a dataset is edited after publication, it will need to be republished for edits to appear to users.

3.4. Share Restricted Files


For more about how to restrict access to files in a dataset, see section 4.2.

  • Users and Groups can also be granted permissions to access restricted files within a published dataset, using a similar method to granting access to unpublished datasets (see 3.2 Share an Unpublished Dataset). Granting permissions at the file level automatically assigns the “file downloader” role to the user.
  • From the dataset page, click Edit => Permissions => Restricted Files.
  • Select Grant Access to Users/Groups.

screenshot

  • A “Grant File Access” popup window appears.
    • Select the individual files you would like to share.
    • Begin entering the username of the user you would like to add to the dataset. As you type, options will appear. (The person you are sharing the dataset with must have a user account in order to be invited to collaborate.)

screenshot

  • Click “Grant.”

Alternative method for sharing restricted files: Private URLs

  • You can share an unpublished dataset widely with users who do not have Texas Data Repository user accounts by creating a private URL.
  • From the dataset’s web page, select Edit => Private URL.
  • In the popup window, click “Create Private URL.” The URL that appears can be shared with trusted users to provide them access to the dataset without requiring them to log in to the repository. Click “Close” to complete the process.

screenshot

  • You may choose to disable the Private URL at any time by navigating back to the Private URL management popup and clicking “Disable Private URL.”

screenshot

3.5. Adding New Users to your Dataverse or Dataset

To give other users roles within your dataverse or dataset, those users must first have user accounts in the TDR system.

A user is available in the database once they have logged into the system at least once.

See section 3.2 Share an Unpublished Dataset for more information about adding the user (once their account is created) to the dataset.

3.6. Assigning Individuals or Groups to another Dataverse

The Texas Data Repository allows dataverse administrators to add other users and groups to an existing dataverse. The administrator can assign specific roles to each user or group using the Permissions and Groups modules found in the Edit dropdown menu for any dataverse.

screenshot

The Permissions page is divided into three modules: Permissions, Users/Groups, and Roles. The Permissions module lists the default roles and permissions of a dataverse for all members of the Texas Data Repository. The Users/Groups module allows a dataverse administrator (usually the creator) to assign specific roles to specific users/groups. The Roles module is a guide that explains the roles and functionality assigned to each type of user account.

screenshot

Under the Permissions module, you can determine the default role for a dataverse. By default, anyone who wants to add to a dataverse must be given access by the owner, and their designated role in the dataverse is Contributor.

This can be modified by:

1. Selecting the Edit Access button.

screenshot

2. Once selected, the user has the option of narrowing or broadening default access levels for any member of TDR who would want to contribute to a dataverse.

a. Note: Roles for specific individuals or groups are available through the Users/Groups module in the Permissions interface of TDR.

screenshot

To assign a new role to a specific user, use the Users/Groups module.

screenshot

1. Select the “Assign Roles to Users/Groups” button.

screenshot

2. Enter the name of the User or the Group first.

a. Note: When you enter a name, it auto fills with names that are already registered with the TDR.

screenshot

3. Then select the role that they should be assigned.

screenshot

4. Once you save changes, the users or groups will be listed in the Users/Groups module. This is the location where you can delete any assigned roles.

screenshot

It is also possible to create groups and assign them roles using the Groups module in TDR.

To add a new group:

  1. Select Groups under the Edit dropdown menu
  2. Select the Create Groups button

screenshot

3. Enter information about the group and select the users or other groups that will make up this group.

screenshot

4. Once the group is created, a notification will appear stating that you have successfully created it. Refresh the page to see the group listed under the Users/Groups module. Use this module in the future to edit or delete the group.

screenshot

3.7. Adding Data to Another Dataverse

If you have the Contributor role in a sub-dataverse (which lets you edit metadata, upload and edit files, edit Terms and Guestbook, and submit datasets for review), you can submit your dataset for review once you have finished uploading your files and filling in all of the relevant metadata fields.

1. To Submit for Review, go to your dataset and click on the “Submit for Review” button, which is located next to the “Edit” button on the upper-right.

screenshot

2. Once Submitted for Review: the Administrator or Curator for this sub-dataverse will be notified.

screenshot

screenshot

screenshot

3. The admin for the sub-dataverse will be asked to either “Publish” the dataset or “Return to Author”. If the dataset is published, the contributor will be notified that it is now published. If the dataset is returned to the author, the contributor of this dataset will be notified that they need to make modifications before it can be submitted for review again.

screenshot
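The review cycle in steps 1 to 3 can be pictured as a small state machine. The sketch below only illustrates the workflow as this guide describes it; the status and action names are hypothetical and it does not reflect how the repository software is implemented.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in review"
    PUBLISHED = "published"


# Allowed moves, as described in the steps above (names are illustrative).
TRANSITIONS = {
    (Status.DRAFT, "submit_for_review"): Status.IN_REVIEW,   # contributor
    (Status.IN_REVIEW, "publish"): Status.PUBLISHED,         # admin or curator
    (Status.IN_REVIEW, "return_to_author"): Status.DRAFT,    # admin or curator
}


def apply_action(status: Status, action: str) -> Status:
    """Return the status that follows from taking action in the given status."""
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(f"{action!r} is not available while {status.value}") from None
```

A dataset returned to its author goes back to draft, so it can be revised and submitted for review again.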

4. Managing Your Data

4.1. Alter Default Terms of Use

For more on Licensing and Permissions, go here.

  • Alter the default Terms of Use for your dataset(s). Default is Creative Commons Zero (CC0) (i.e., a public domain dedication that reserves no rights).
  • Set up Terms of Use by clicking Edit and selecting the Terms tab.

screenshot

  • The default CC0 public domain dedication facilitates reuse and extensibility of research data.
  • You may also choose to establish custom terms of use by selecting “No, do not apply CC0 – Public Domain Dedication” and defining the custom terms in the resulting textbox.

screenshot

  • You are required to set up terms of use for all datasets you designate as restricted. You may allow users to request access to restricted datasets using the Request Access option.

4.2. Make your Dataset Restricted

1. Access your dataset, scroll down to the list of files, and select the file(s) you would like to restrict using the check boxes next to each file.

screenshot

2. Click Edit Files and select Restrict.

screenshot

3. A pop-up window will appear. Please describe the terms of access, which will inform users how and if they can gain access to the restricted files.

screenshot

4. If you want users to be able to request access, select the checkbox marked Enable Access Request. Click Continue.

screenshot

5. Your selected data files are now restricted.

4.3. Create Multiple Versions of a Dataset

Unpublished datasets are “Draft Datasets.” Once published, the dataset will be assigned to the category “Version 1.” Changes made to the dataset (see below) will result in the creation of a new version.

  1. You may choose whether the changes merit a small change to Version 1.1 or a more significant change to Version 2.0, and so on.
  2. Examples of a small change that merits Version 1.1 include a typo correction or a small metadata change, whereas a citation change or the addition of a new data column may mean that Version 2.0 is appropriate.
  3. The addition of a new file automatically changes the version to the next whole number. For example, if the dataset is Version 1.0 and a file is added, the system automatically creates a Version 2.0.
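The versioning rules above can be summarized in a few lines. This is a minimal sketch of the rules as stated in this guide, with a hypothetical function name; the actual choice between a minor and a major version is made by you in the interface when republishing.

```python
from typing import Tuple


def next_version(current: Tuple[int, int], file_added: bool,
                 major_change: bool = False) -> Tuple[int, int]:
    """Return the (major, minor) version that follows current.

    Adding a file always moves to the next whole number. Otherwise the
    depositor decides whether the change is major (e.g. a citation change)
    or minor (e.g. a typo correction).
    """
    major, minor = current
    if file_added or major_change:
        return (major + 1, 0)
    return (major, minor + 1)
```

For example, `next_version((1, 0), file_added=True)` gives `(2, 0)`, matching the example in item 3, while a typo fix on Version 1.0 gives `(1, 1)`.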

All versions are available to be viewed by the administrator or curator, and can be made available to the public. This includes Draft Dataset versions.

1. Click the Versions tab to view all versions and see the changes.

screenshot

2. Click Show Details in the Versions tab for more information about additions and edits.

screenshot

3. Use the Show Differences option when comparing two versions to identify particular differences.

screenshot

4.4. Deaccession a Dataset

Deaccessioning is reserved for circumstances in which there is a significant, often legal, case for the removal of public access to a dataset. To deaccession a dataset:

1. Click Edit and select Deaccession Dataset.

screenshot

2. You will have the option to deaccession particular versions or an entire dataset.

screenshot

3. You are required to include the reason for deaccessioning the dataset.

screenshot

If a user navigates to a deaccessioned dataset’s persistent URL, they will see a landing page with the citation for the dataset, but not any files or metadata that were deaccessioned.

screenshot

4.5. Turn on the Guestbook Feature

Guestbooks allow you to collect data about who is downloading the files from your datasets within a Dataverse for which you are the administrator. You can decide to collect account information (username, given name and last name, affiliation, etc.) as well as create custom questions (e.g., What do you plan to use this data for?). You are also able to download the data collected from the enabled guestbooks as Excel files to store and use outside of Dataverse.

1. From your Dataverse page, click on the Edit button and select Dataset Guestbook.

screenshot

2. By default, guestbooks created in the parent dataverse of your dataverse will appear. If you do not want to use or see those guestbooks, uncheck the checkbox that says Include Guestbooks from Root Dataverse.

3. To create a new guestbook, click the Create Dataset Guestbook button on the right side of the page.

screenshot

4. Name the guestbook, determine the account information that you would like to be required (all account information fields show when someone downloads a file), and then add Custom Questions (i.e., for users to answer before being allowed to download data; can be required or not required).

screenshot

5. Hit the Create Dataset Guestbook button once you have finished.

screenshot

To assign a guestbook to a particular dataset:

1. Click the button in the Action column that says Enable. A guestbook is enabled by default upon creation.

screenshot

2. Once a guestbook has been enabled, go to Terms for a dataset and select a guestbook for it.

screenshot

screenshot

3. Select the Terms tab.

4. Click on Edit Terms Requirements.

5. Select the guestbook you intend to use.

There are also options to view, copy, edit, or delete a guestbook. Once someone has downloaded a file in a dataset where a guestbook has been assigned, an option to download collected data will appear.

4.6. Add a Logo to your Dataverse

After creating a new Dataverse:

1. Click the Edit button.

2. Click on Theme + Widgets.

screenshot

3. Use Upload Image to add a logo.

screenshot

4. Use Header Colors, Tagline, and Website to provide additional customization.

4.7. Dataset Templates

Templates are useful when several datasets share the same information in multiple metadata fields that you would prefer not to keep typing in manually, or when you want to use a custom set of Terms of Use and Access for multiple datasets in a dataverse. In Dataverse 4.0, templates are created at the dataverse level. A template can be deleted (so that it no longer shows for future datasets), set as the default (not required), or copied so that you do not have to start over when creating a new template with metadata similar to that of another template. Deleting a template does not affect datasets that have already used it.

How do you create a template?

1. Navigate to your dataverse, click on the Edit Dataverse button and select Dataset Templates.

2. Once you have clicked on Dataset Templates, you will be brought to the Dataset Templates page. On this page, you can 1) decide to use the dataset templates from your parent dataverse, 2) create a new dataset template, or 3) do both.

screenshot

3. Click on the Create Dataset Template button to get started. You will see that the template is the same as the create dataset page with an additional field at the top of the page to add a name for the template.

4. After filling in the metadata fields you have information for and clicking Save and Add Terms, you will be brought to the page where you can add custom Terms of Use and Access. If you do not need custom Terms of Use and Access, click Save Dataset Template, and only the metadata fields will be saved.

screenshot

screenshot

5. After clicking Save Dataset Template, you will be brought back to the Manage Dataset Templates page and should see your template listed there now with the make default, edit, view, or delete options.

6. A dataverse does not have to have a default template and users can select which template they would like to use while on the Create Dataset page.

7. You can also click on the View button on the Manage Dataset Templates page to see what metadata fields have information filled in.

screenshot

*Please note that the ability to choose which metadata fields are hidden, required, or optional is done on the General Information page for the Dataverse.

5. Accessing and Evaluating Data

5.1. Download Datasets

Identify an existing dataset (one that is not your own) within the Texas Data Repository.

1. Click on the name of the dataset or on the thumbnail image to be taken to the page for the dataset.

screenshot

2. Once on a dataset page, you will see the Title, Citation, Description, and other metadata fields.

screenshot

3. Within the Files tab on the dataset page, select the file(s) that you would like to download and then click the Download button above the files. The selected files will download in zip format.

screenshot

5.2. Use Mapping and Statistical Analysis Tools

A limited number of data formats can be visualized using tools built into the Texas Data Repository interface. Use TwoRavens to visualize CSV, RData, and dta files. A quick guide to getting started with TwoRavens is below.

1. TwoRavens is a visual tool for manipulating statistical data in tabular format.

  • It accepts the following formats: Stata (commercial), SPSS (commercial), R (open source), and Comma Separated Values.
  • It does NOT accept Tab Separated Values.
  • RData files larger than 1 MB will not be processed.

2. An “Explore” button will automatically appear next to the published datasets in the data list. In order for TwoRavens to successfully display the data:

  • The first row of the table must contain the data labels.
  • All data must be numeric data, formatted as numbers.
  • The data must be in one of the file formats listed above.
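For CSV files, the first two display requirements above can be checked before upload. The sketch below is a rough pre-check with a hypothetical function name, not a TwoRavens utility. It assumes that a header cell which parses as a number is not a label, and it treats empty cells as non-numeric, which may be stricter than TwoRavens itself.

```python
import csv
import io
from typing import List


def _is_number(text: str) -> bool:
    """True if text parses as a float (note: this also accepts 'nan' and 'inf')."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def tworavens_csv_problems(csv_text: str) -> List[str]:
    """Report header and non-numeric-cell problems in CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header, data = rows[0], rows[1:]
    problems = []
    # Heuristic: a blank or numeric header cell is probably data, not a label.
    if any(not cell.strip() or _is_number(cell) for cell in header):
        problems.append("first row should contain a label for every column")
    for line_no, row in enumerate(data, start=2):
        for label, cell in zip(header, row):
            if not _is_number(cell):
                problems.append(
                    f"row {line_no}, column {label!r}: {cell!r} is not numeric"
                )
    return problems
```

To check a file on disk, read it first, for example with `tworavens_csv_problems(open("data.csv", newline="").read())`, where `data.csv` is a placeholder path.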