From 7e658e2520d696c2050e89e984401326b6b54059 Mon Sep 17 00:00:00 2001
From: Vicky Steeves
Date: Mon, 28 Aug 2017 17:07:26 -0400
Subject: [PATCH] updated readme and resume, added august blog

---
 README.md             |  42 +++++++++++++-
 posts/2017-08-25.html | 125 ++++++++++++++++++++++++++++++++++++++++++
 stories/resume.html   |  62 +++++++++++----------
 3 files changed, 197 insertions(+), 32 deletions(-)
 create mode 100644 posts/2017-08-25.html

diff --git a/README.md b/README.md
index b47bd1b..a165dff 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,44 @@
-[![forthebadge](http://forthebadge.com/images/badges/contains-cat-gifs.svg)](http://forthebadge.com) [![Build Status](https://travis-ci.org/VickySteeves/personal-website.svg?branch=master)](https://travis-ci.org/VickySteeves/personal-website)
+[![forthebadge](http://forthebadge.com/images/badges/60-percent-of-the-time-works-every-time.svg)](http://forthebadge.com)
+[![forthebadge](http://forthebadge.com/images/badges/contains-cat-gifs.svg)](http://forthebadge.com)
-### [vickysteeves.com](http://vickysteeves.com)
+### About
+My website, [vickysteeves.com](http://vickysteeves.com), upgraded from coding-by-hand (n00b) to [Nikola](https://getnikola.com/), a static site generator.
-My website, recently upgraded from coding-by-hand (n00b) to [Nikola](https://getnikola.com/), a static site generator.
+### Building
+This site relies on Python and [Nikola](https://getnikola.com/), a static site generator.
+
+I would recommend you use a virtualenv to build and view this website. This is a Python tool to create isolated Python environments. The HitchHiker's Guide to Python has a [great guide](http://docs.python-guide.org/en/latest/dev/virtualenvs/) on virtual environments that I used to learn how to use/interact with virtualenvs.
+
+Here's how to make and activate a virtual environment:
+
# install the tool virtualenv
+$ pip install virtualenv
+
+# create the Python 3 virtual environment
+$ virtualenv -p python3 my-website
+
+# activate the virtual environment
+$ source my-website/bin/activate
+
+ +Now, you can get started and install all of the dependencies of my website! + +
# install the dependencies
+$ pip install "Nikola[extras]"
+
+# clone this repo
+$ git clone git@gitlab.com:VickySteeves/personal-website.git
+
+# change directory (cd) so you are in the right folder for the website
+$ cd personal-website
+
+# build the website
+$ nikola build
+
+# see the website
+$ nikola serve -b
+
+
+You should now be able to see and interact with my website locally!
 ### RSS Feed
 Found here: [http://vickysteeves.com/rss.xml](http://vickysteeves.com/rss.xml)
diff --git a/posts/2017-08-25.html b/posts/2017-08-25.html
new file mode 100644
index 0000000..b45b8cb
--- /dev/null
+++ b/posts/2017-08-25.html
@@ -0,0 +1,125 @@
+ + + + + + + + +

See this post on GitLab's blog here.

+ +

NYU reproducibility librarian Vicky Steeves shares why GitLab is her choice for ongoing collaborative research, and how it can help overcome challenges with sharing code in academia.

+ +

GitLab is a great platform for active, ongoing, collaborative research. It enables folks to work together easily and share that work in the open. This is especially pertinent given the problems with sharing code in academia, across time and people.

+ + + + phd-code-comic + +

It's no surprise that GitLab, a platform for collaborative coding and Git repository hosting, has features for reproducibility that researchers can leverage for their own and their communities' benefit.

+ +

What exactly is reproducibility?

+ +

Reproducibility is a core component of a variety of work, from software engineering to research. For software engineers, the ability to reproduce errors or functionality is key to development. For researchers, reproducibility is about independently verifying results and methods, building on top of previous work, and increasing the impact, visibility, and quality of research. Y'know. That Sir Isaac Newton quote in every reproducibility presentation ever: "If I have seen further, it is by standing on the shoulders of giants."

+ +

Like all things, reproducibility exists on a spectrum. I like Stodden et al.'s definitions from the 2013 ICERM report, so I'll use those:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ICERM Report DefinitionsPotential Real-World Examples
Reviewable Research: Sufficient detail for peer review and assessmentThe code and data are openly available
Replicable Research: Tools are available to duplicate the author's results using their dataThe tools (software) used in the analysis are freely available for others to confirm results
Confirmable Research: Main conclusions can be attained independently without author's softwareOthers can reach the conclusion using similar tools, not necessarily the same as the author, or on a different operating system
Auditable Research: Process and tools archived such that it can be defended later if necessaryThe tools, environment, data, and code are put into a preservation-ready format
Open/Reproducible Research: Auditable research made openly availableEverything above is made available in a repository for others to examine and use
+ +   + +

The last row there is the goal: open and reproducible research. Releasing code and data is key to open research, but not necessarily enough for reproducibility. This is where the concept of computational reproducibility becomes important, where whole environments are captured. You could also look at it this way:

+ + reproducibility-pyramid + +

How can GitLab help?

+ +

There are a few solutions out there, including containers (such as Docker or Singularity) for active research, and o2r and ReproZip for capturing and reproducing completed research. For this post, I'm going to focus on active research and containers.

+ +

I like GitLab for research reproducibility because it makes working together simple and seamless. There's no hacking together 100 different third-party services. GitLab has hosting, LFS, and built-in Continuous Integration for free, for both public and private repositories! Everything is integrated in a single GitLab repository which, if made publicly available, can enable secondary users to reproduce results in a more streamlined fashion. You can also keep these private to a group: you control the visibility of everything in one repository in one place, as opposed to updating permissions across multiple services.

+ +

There are a few key features that set GitLab apart when it comes to containers and reproducibility. The first is that GitLab doesn't use a third-party service for continuous integration. It's shipped with CI runners which can use Docker images from GitLab's registry. Basically, you can use the Docker Container Registry, a secure, private Docker registry, to choose a container that GitLab CI uses to run each job in a separate and isolated container.

+ +

gitlab-ci-repro

+ +

If you don't feel like using the GitLab registry, you can also use images from DockerHub or a custom Docker container you're already using locally. These can be integrated with GitLab CI, and if made public, any secondary users can use them as well!

+ +

Let's look at an example

+ +

This process is set up in a single file, a .gitlab-ci.yml. Another feature that makes my life easier – GitLab can syntax-check the CI config files! The .gitlab-ci.yml file describes the pipelines and stages, each of which has a different function and can have its own tags, produce its own artifacts, and reuse artifacts from other stages. These stages can also run in parallel if needed. Here's an example of what a basic config file looks like with R:

+ +
image: jangorecki/r-base-dev
+test:
+  script:
+    - R CMD build . --no-build-vignettes --no-manual
+    - PKG_FILE_NAME=$(ls -1t *.tar.gz | head -n 1)
+    - R CMD check "${PKG_FILE_NAME}" --no-build-vignettes --no-manual --as-cran
+	

And here's an example of building a website using GitLab and the static site generator Nikola:

+
image: registry.gitlab.com/paddy-hack/nikola:7.8.7
+test:
+  script:
+    - nikola build
+  except:
+    - master
+
+pages:
+  script:
+    - nikola build
+  artifacts:
+    paths:
+      - public
+  only:
+    - master
+	
+ +

It's also worth noting that you can use different containers per step in your workflow, if you outline it in your .gitlab-ci.yml. If your data collection script runs in one environment but your analysis script needs another, that's perfectly fine using GitLab, and others have the information to reproduce it easily! Another feature that sets GitLab apart is that a build of one project can trigger a build of another, AKA multi-project pipelines. For those of you working with big data, you can automatically spin up and down VMs to make sure your builds get processed immediately with GitLab's CI as well.
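
To make the "different containers per step" idea concrete, here's a minimal sketch of what such a config might look like. Everything specific in it is an assumption for illustration: the job names, the `collect.py` and `analysis.R` scripts, the `data/` folder, and the `python:3.6` image are all hypothetical placeholders, not taken from a real project. Each job simply names its own `image`, and the artifacts from the first stage are passed along to the second.

```yaml
# Hypothetical two-step workflow: each job runs in its own container.
stages:
  - collect
  - analyze

collect_data:
  stage: collect
  image: python:3.6              # assumed image for the collection step
  script:
    - python collect.py          # hypothetical data collection script
  artifacts:
    paths:
      - data/                    # hypothetical output folder, kept for the next stage

run_analysis:
  stage: analyze
  image: jangorecki/r-base-dev   # same R image as the earlier example
  script:
    - Rscript analysis.R         # hypothetical analysis script
```

Because both environments are written down in one file, a secondary user can see exactly which container each step expects.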

+ +

Here are some other great resources and examples of using GitLab to make research more reproducible:

+ + + +
diff --git a/stories/resume.html b/stories/resume.html
index b3f4288..ce09df2 100644
--- a/stories/resume.html
+++ b/stories/resume.html
@@ -12,10 +12,11 @@
@@ -36,15 +37,15 @@
    -
  • Simmons College, Boston, MA, USA:
  • +
  • Simmons College, Boston, MA, USA:
    • -
    • Master of Library and Information Science, August 2014 +
    • Master of Library and Information Science, August 2014
      • GPA: 3.85
    • Research Opportunities
      • Small World Project:
        • Research done accompanying Dr. Kathy Wisser, March-June 2014
        • Research completed. I provided software analysis using Gephi, a data visualization software, on researchers' social network analysis of historical relationships between literary figures.
      -
    • Bachelor of Science in Computer Science and Information Technology, May 2013
    • +
    • Bachelor of Science in Computer Science and Information Technology, May 2013
      • GPA: 3.75
      • Honours Thesis: Computational Linguistic Approach to Inflection in Human Speech and Difference in Meaning
        @@ -80,9 +81,6 @@
      • Simmons College 3-1 Program, where students complete an undergraduate degree in three years and a master's degree in one year, first participant; 2010-present
      • Simmons College Dean's List, 2010-2013
      • United Pilgrimage for Youth, sponsored participant; Summer 2009
- -
  • Hamilton Wenham Regional High School, Hamilton, MA, USA:
  • -
    • High School Diploma, June 2010
    @@ -149,10 +147,19 @@
    - +
    +
    + + +
    +
    @@ -214,9 +217,10 @@

    Professional Output