Build your own universe

Scale high-quality research data provisioning with R packages

Travis Gerke @travisgerke
Garrick Aden-Buie @grrrck
Moffitt Cancer Center
August 28, 2020

Thanks to the amazing organizers of R/Medicine! Intro self.

This is a co-presented talk with Garrick Aden-Buie; I'll hand off to him at the conclusion of my slides.

Since I do not own the last slide, I'll inject a couple of formalities here:

  • You're about to see some really cool slides, and I can personally take credit for effectively none of them. Garrick is the wizard.
  • We're hiring multiple positions with a specific eye out for R devs/DS over the next few months; feel free to reach out.

To orient you to the origin of this talk, where Garrick and I are coming from, and how it came to be that we gain high value from an R pkg universe, I'll begin with a story which is based on Moffitt's experience but could represent any institution's data-related journey...

Once upon a time, our organization conducted all data-related business in an amorphous cloud known as the IT department.

This is a common paradigm for many healthcare organizations in early stages of data maturity.

The IT dept had many roles.

Hospital operations needed dashboards for planning purposes.

Researchers needed patient or biospecimen data for their IRB-approved protocols.

Somebody needed to know about data lineage as well as coding or metadata standards; basically, how the data got here and why it looks the way it does.

Organizing databases into a warehouse and granting access was important. We had someone for that!

But eventually, some of these teams were operating at a scale that would be better served outside the IT gravity field.

One of these was business intelligence. The person who was making dashboards for operational (i.e., non-research) stakeholders is now part of a larger team that creates such products at scale.

Next, our research-focused stakeholders had many of the same needs as the operational end-users, such as reporting/dashboarding and, importantly, data provisioning. The twist in the research space is that such activities must be conducted in accordance with IRB and ethical approval, and assessing study design feasibility as it relates to data availability and structure requires specialized training. Hence, the CDS team was formed. This is one of the groups that Garrick and I are representing today.

But CDS can't operate at scale in a vacuum either. A critical and complementary team, Data Quality and Standards, formed from IT's "data historian." They ensure that data dictionaries are robust and data lineage is understood by the BI and CDS teams for appropriate downstream data usage.

As institutional data assets grew, warehousing and access rules became necessarily complex. Data engineering formed a new continent within IT to meet the challenge.

Now, with so many teams completing data-related operations at a rapid pace, we needed a shuttlecraft to coordinate technology strategy, inform general data governance, and mine valuable software ore from the astRoid belt.

When those tools are ready for placement and maintenance in the institutionally supported production environment, the new Applications Development land mass within IT can help out. For example, they would maintain software such as RStudio Server or GitHub Enterprise.

This whole story, admittedly with some shortcuts for clarity, mirrors the rise of the Chief Data Officer role across the healthcare industry. Indeed, all of these groups tend to roll up or be horizontally aligned in some way with the CDO's vertical.

Scaling provisioning
by scaling people

Taken together, this is our first hint that scaling data provisioning isn't just about scaling data: it's about scaling the people who are doing the provisioning. In part 2, Garrick is going to tell you more about the "how."
