Discussion and Workshop Descriptions

Day 1 Discussions

Big data

Want to hear about how some groups are dealing with big data? We'll talk about an example of constantly streamed data from loggers as well as big data from satellites. Maybe you want to learn about resources available on campus for data storage, compute, and sharing, and how to get access to them? Let's talk about what big data means to you and your problems, and we'll help you manage your big data.

Facilitator: TBD

Biological data (field samples, lab samples, etc.)

Join us to discuss issues that come up with managing, storing, sharing, and archiving biological data, such as field samples and lab samples. What are the various ways that data can be collected, from instruments gathering data to using paper in the field? Is there a need to preserve physical samples as well as metadata about the samples? We’ll talk about existing resources and how to describe and preserve your data.

Facilitators: Katie Wilson, Julie Kelly, Libraries

Copyright, licenses, and data "ownership"

People sure do talk about who "owns" data, and mean a lot of different things by it. Data is not usually copyrightable in the US, but it sometimes can be, and it often is in other countries. There can also be contracts, licenses, and other legal issues involved in who gets to control and/or use data. Whether you're concerned about the data you're producing yourself, or someone else's data (or other media) you want to use, let's talk.

Facilitator: Nancy Sims, Libraries

Electronic Lab Notebooks and other tools to manage data and notes

Researchers in all disciplines increasingly create and manage large amounts of data, and concerns about the reproducibility of research have led to an increased emphasis on maintaining good records of what you did, when, and why. Come discuss tools to effectively manage the record of your research, from off-the-shelf tools like Evernote, to online platforms meant to improve science like the Open Science Framework, to specialized Electronic Lab Notebooks.

Facilitator: Meghan Lafferty, Libraries

Geospatial data

Mapping and spatial analysis can be helpful in nearly every discipline. U-Spatial provides support for spatial research. This discussion will introduce the resources available at the University for working with geographic information systems (GIS), remote sensing, visualizations, and spatial computing. There are amazing free resources including software, spatial data, training and a help desk to answer your questions.

Facilitator: Melinda Kernik, Libraries

Human subjects data

Join us for a discussion on issues that come up with managing, storing, sharing, and archiving human subjects data. What language do you need in your IRB if you might need to share or archive your data later? How should you store sensitive or identifying data? What are options for sharing human subjects research data? How do you de-identify this data? What are the considerations for sharing interview or other qualitative human subjects data?

Facilitators: Alicia Hofelich Mohr, PhD, LATIS; Sarah Jane Brown, Libraries

Images, Texts, Audio, & Video

Do your colleagues make fun of your data for not being data? Does your research produce files that look more like art than numbers? Well, your research products are data, too, and this session will cover the special considerations of managing images, texts, and other multimedia files. We will cover topics such as file naming and formats, copyright, de-identification, analysis, sharing, and organizing.

Facilitators: Lois Hendrickson, Amanda Wick, Benjamin Wiggins, Libraries

Manage citations and PDFs

Research can be messy. One way to avoid this is by being intentional about organizing your citations and full-text documents. Citation managers can help you export documents and metadata from various platforms (e.g., Google Scholar, PsycINFO), organize them into folders and subfolders, categorize them (e.g., with tags), create bibliographies in a variety of styles, and more!

Facilitators: Hayley Coble, Kristen Mastel, Wanda Marsolek, Libraries

Qualitative materials/data

The source materials for qualitative research take many forms: archival materials, fieldnotes, interview transcripts, images, and recordings. This discussion will cover different strategies for organizing and documenting qualitative research materials. It will also introduce participants to thinking about management and documentation in terms of qualitative workflows, from source organization, to thematic structures and codebook creation, to activity logging and data export for preservation.

Facilitators: Mike Beckstrand, PhD, LATIS; Shanda Hunt, Libraries

Storage/backup of data

Whether you are collecting data or gathering secondary sources, you need to put your files somewhere. Some research projects can be stored on a single hard drive, but what if your research has too much data for one computer, or has additional privacy concerns or collaboration requirements? We’ll talk about University resources for data storage and protection against data loss, and about managing storage for digital files during and after a project.

Facilitator: Valerie Collins, Libraries

Day 2 Workshops

Atlas.ti/NVivo

Atlas.ti and NVivo are qualitative data management, coding, and markup tools that facilitate powerful querying and exploration of source materials for both mixed methods and qualitative analysis. We'll do a quick overview of the management, coding, and analysis features of these two suites of qualitative data analysis software and compare their similarities and differences. No experience necessary--come to learn about how these tools can improve upon your pencil-highlighter-and-paper workflows!

Presenter: Mike Beckstrand, PhD, LATIS

Citation Managers

Research can be messy. One way to avoid this is to organize your citations and full-text documents. Citation managers can help you export documents and metadata from various platforms (e.g., Google Scholar, PsycINFO), organize them into folders, categorize them, and create bibliographies in many styles. Explore Mendeley and Zotero and get hands-on experience with good file management. Please bring a laptop.

Presenters: Meghan Lafferty, Hayley Coble, and Kristen Mastel, Libraries

Control all your versions with git and GitHub!

Come learn the benefits of git version control for academic research. git and GitHub are well-established tools in the world of software development, and are now making their way into reproducible research workflows. We'll practice creating a repository to store your research code/analyses and track changes as you progress through a project. Are you totally new to git? Or maybe you've tried git before but just didn't "git" it? This session will have something for everyone!

Presenter: Colin McFadden, LATIS

Excel Pivot Tables for Reproducible Research

Got Excel data in your life? If you're trying to make sense of it all, Excel Pivot Tables can be your friend. Pivot Tables are a convenient tool for quickly summarizing and exploring tabular data. This session is meant for those new to Excel Pivot Tables or for those looking for a refresher. We'll create a Pivot Table with some test data, and discuss the ways in which Pivot Tables can simplify your life. We will also cover some issues to watch out for when using Excel to store your data.

Presenters: Valerie Collins, Allison Langham, Libraries

Introduction to Python

Python is popular in academic research because it is a powerful but easy-to-learn programming language. This workshop covers the pros and cons of using Python for research computing, the fundamental building blocks and basic grammar of a Python program, and how the basic libraries of the scientific Python stack can be connected to create robust, reproducible analyses. The workshop is intended for grad students and others new to Python. Some previous programming experience (e.g., MATLAB or R) will be helpful.

Presenter: Kelly Thompson, Libraries

Introduction to R

Heard of R but have no idea what it's all about? Come learn the basics of R, including how to use RStudio, write a script, set a working directory, and read in, explore, and save data files. No experience using R is assumed. Please bring a computer with R and RStudio installed.

Presenter: Frank Sayre, Libraries

Learning R Markdown & dplyr

This workshop will introduce reproducible report generation using R Markdown. R Markdown is a document format that allows users to create dynamic reports. Such reports integrate R code, data, figures, and tables with text that describes your analysis. Using this tool, it is possible to create PDF, HTML, Word, and other output formats, all within a reproducible workflow. In addition, you will learn about dplyr, a popular and powerful R package for data manipulation. Together, these two tools will help make you more efficient and systematic. Basic knowledge of R is useful. Bring a laptop with the most recent versions of R and RStudio.

Presenter: Alicia Hofelich Mohr, PhD, LATIS

Python for Reproducible Workflows

This session will discuss Python's place in the pantheon of data management tools, particularly the ease of using Python to create reproducible workflows and script complex data transformations. After examining the pros and cons of Python vs. other options, we will present one or two case studies of how LATIS has used Python to manage its clients' research.

Presenter: David Olsen, LATIS

Qualtrics

Qualtrics is a versatile data collection tool for a wide range of survey and experiment needs. However, finding the right bells and whistles when using this tool for your research can be daunting. We will give a brief overview of the tool and good practices for developing reproducible workflows so that your future self loves what you did. No experience in Qualtrics is necessary, but please set up your free UMN Qualtrics account prior to the workshop (https://it.umn.edu/technology/qualtrics).

Presenter: Andy Sell, LATIS