Explorations

How Fresh is that Code?

Andrew Elliott

One of the beauties of the R programming language is the vitality of its user community. Users are continuously uploading new or revised extension packages. Looking at the range of packages available on CRAN, the "Comprehensive R Archive Network", I was struck by how many of them had recent versions registered. So I decided to dig a little, and at the same time give you a flavour of quick and dirty data exploration with R. Some highlights:

Load the package list from CRAN (getRPackages is a helper function whose definition is not shown here):

packages <- getRPackages("http://cran.r-project.org/web/packages/available_packages_by_date.html")
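
Since the definition of `getRPackages` does not appear in this post, here is a minimal sketch of an offline stand-in for anyone following along. It only reproduces the shape the rest of the code relies on: a `name` column and a `dt` column of UTC date-times. The function name `makePackageTable` is hypothetical, and the sample rows are just the five packages dated 2015-11-03 in this post, not the full archive.

```r
# Hypothetical stand-in for getRPackages(), whose definition is not shown.
# Builds a data frame with the two columns used in this post: name and dt.
makePackageTable <- function(name, date) {
  data.frame(
    name = name,
    dt = as.POSIXct(date, format = "%Y-%m-%d", tz = "UTC"),
    stringsAsFactors = FALSE
  )
}

# Sample rows only: the five packages dated 2015-11-03 in this post.
packages <- makePackageTable(
  name = c("DLMtool", "epiDisplay", "MM2S", "quickmapr", "SALTSampler"),
  date = rep("2015-11-03", 5)
)

nrow(packages)
max(packages$dt)
```

With only these sample rows the counts will of course differ from the full-archive figures quoted in the post.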

How many packages are in the archive?

dim(packages)[1]
## [1] 7422

Date of stalest package?

min(packages$dt)
## [1] "2005-10-29 UTC"

Date of freshest package?

max(packages$dt)
## [1] "2015-11-03 UTC"

Ooh! That's today. How many packages are fresh today?

nrow(packages[packages$dt==max(packages$dt),])
## [1] 5

And just for interest, which are they?

packages[packages$dt==max(packages$dt),c("name", "dt")]
##            name         dt
## 1      DLMtool  2015-11-03
## 2   epiDisplay  2015-11-03
## 3         MM2S  2015-11-03
## 4    quickmapr  2015-11-03
## 5  SALTSampler  2015-11-03

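Ok, so let's compute the ages of the packages (in weeks). How many packages are 4 weeks old or less?

library(lubridate)  # provides interval() and ddays()
today <- max(packages$dt)
# ddays() is the current name for what older lubridate versions called edays()
packages$age <- interval(packages$dt, today) / ddays(7)
sum(packages$age <= 4)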
## [1] 587
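
The same kind of age calculation can be done without the lubridate package. As a sketch, base R's `difftime` with `units = "weeks"` returns the age directly. The dates below are the stalest and freshest dates reported above, plus one illustrative date exactly four weeks before the freshest. The last line just redoes the percentage arithmetic from the two counts already shown (587 and 7422).

```r
# Base R alternative to the lubridate interval arithmetic (a sketch).
today <- as.POSIXct("2015-11-03", tz = "UTC")
dt <- as.POSIXct(c("2005-10-29", "2015-10-06", "2015-11-03"), tz = "UTC")

# Age in weeks as a plain numeric vector
ageWeeks <- as.numeric(difftime(today, dt, units = "weeks"))
ageWeeks

# How many of these three dates are 4 weeks old or less?
sum(ageWeeks <= 4)

# Share of the archive that is 4 weeks old or less, from the counts above
round(100 * 587 / 7422, 1)
```

Working in UTC avoids daylight-saving shifts, so a span of 28 calendar days comes out as exactly 4 weeks.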

Around 8%! Let's look at the distribution by age. For convenience, convert weeks to approximate years:

ageInYears <- packages$age / 52
hist(ageInYears, breaks=20)

More than half the packages are less than a year old, and it's easy to see that growth took off about 4 years ago after several years of slow burn. Let's look at the growth so far this calendar year (roughly 44 weeks):

freshThisYear<-packages[packages$age<=44,]$age
hist(freshThisYear, breaks=44)
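
Proportions read off a histogram can also be checked numerically. A minimal sketch, using a small made-up vector of ages in weeks (illustrative values, not the CRAN data): `mean()` of a logical vector gives the share of packages under one approximate year, and `hist(..., plot = FALSE)` returns the bin counts without drawing anything.

```r
# Illustrative ages in weeks (made-up values, not the CRAN data)
ageWeeks <- c(1, 3, 10, 30, 40, 60, 120, 300, 500)
ageInYears <- ageWeeks / 52

# Proportion of packages younger than one (approximate) year
mean(ageInYears < 1)

# Bin counts without drawing the plot
h <- hist(ageInYears, breaks = 20, plot = FALSE)
data.frame(from = head(h$breaks, -1), to = tail(h$breaks, -1), count = h$counts)
```

Run against the real `packages$age` column, the same two calls would put a number on the "more than half" impression from the plot.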

I think it's clear that the takeup of R continues to accelerate, if the freshness of the user-contributed archive is any sort of guide.