
How To Write A Christmas Song Using Data Science – Lynchpin


Tim Bush – Data Scientist at Lynchpin.

Clearly, the best Christmas song ever written is Fairytale of New York. It contains despair, heartbreak and profanity, a bit like secret Santa gift giving at the Lynchpin office.

However, as far as I know, Shane MacGowan wrote this without the help of analytics or machine learning algorithms, so I think I can probably do better.

To start with I will use the interesting ‘Million Song Dataset’, which is a 280GB database of songs hosted by Columbia University:

Million Song Dataset | scaling MIR research

This dataset is large and therefore difficult to access. However, an SQL database of some of the metadata of each track is more manageable. As a naive starting point, we can pull out tracks that have ‘Christmas’, ‘Santa Claus’, ‘Yuletide’, ‘Reindeer’, ‘Jingle Bells’ or ‘Ho Ho Ho’ in the title. This gives us about 3136 songs to play with.
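As a rough sketch of this title filter, the snippet below assumes the metadata sits in an SQLite table called `songs` with `track_id`, `title`, `artist_name` and `year` columns. The table name, the column names, the track IDs and the toy in-memory database are all assumptions made for illustration, not details taken from the real dataset.

```python
import sqlite3

# Title keywords taken from the text above.
KEYWORDS = ["Christmas", "Santa Claus", "Yuletide", "Reindeer", "Jingle Bells", "Ho Ho Ho"]


def find_christmas_tracks(conn, keywords=KEYWORDS):
    """Return rows whose title contains any keyword (SQLite LIKE ignores ASCII case)."""
    where = " OR ".join("title LIKE ?" for _ in keywords)
    sql = f"SELECT track_id, title, artist_name, year FROM songs WHERE {where}"
    return conn.execute(sql, [f"%{kw}%" for kw in keywords]).fetchall()


def make_demo_db():
    """Build a tiny in-memory stand-in for the metadata database (assumed schema, toy rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE songs (track_id TEXT, title TEXT, artist_name TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO songs VALUES (?, ?, ?, ?)",
        [
            ("T1", "White Christmas", "Bing Crosby", 1942),
            ("T2", "A Song About Nothing", "Placeholder Artist", 1999),
            ("T3", "Jingle Bells (Demo)", "Placeholder Artist", 2001),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in find_christmas_tracks(make_demo_db()):
        print(row)
```

Binding the keywords as query parameters keeps the SQL free of hand-escaped strings, and extending the filter is just a matter of adding to the list.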

Christmas songs over time
If we look at the proportion of total songs that are Christmas songs over the past few decades, the results are rather depressing. The proportion of total pop songs that are Christmas songs has been decreasing since the 1940s (the earliest Christmas song is ‘White Christmas’ by Bing Crosby in 1942), and has basically been rubbish since the 1950s.

Clearly the time is ripe for some Data Science innovation in the Christmas genre.

Which artist should record a Christmas song?

Looking at the Last_fm ‘hotttnesss’ (a popularity measure) of artists, we can see that artists who have released Christmas songs are more popular than the average artist (we can only speculate whether this is correlation or causation).

In terms of artist choice, the most popular artist on Last_fm who has NOT yet released a Christmas song is Daft Punk*, so this is clearly the only logical choice for our artist.
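The artist selection could be expressed as a single query, sketched below. It assumes a `songs` table with `artist_name`, `artist_hotttnesss` and `title` columns (assumed names), and the artists and scores in the demo are invented placeholders rather than real figures.

```python
import sqlite3

KEYWORDS = ["Christmas", "Santa Claus", "Yuletide", "Reindeer", "Jingle Bells", "Ho Ho Ho"]


def most_popular_without_christmas(conn, keywords=KEYWORDS):
    """Return (artist_name, hotness) for the hottest artist with no Christmas-titled track."""
    is_xmas = " OR ".join("title LIKE ?" for _ in keywords)
    sql = (
        "SELECT artist_name, MAX(artist_hotttnesss) AS hotness "
        "FROM songs GROUP BY artist_name "
        f"HAVING SUM({is_xmas}) = 0 "  # keep artists with zero matching titles
        "ORDER BY hotness DESC LIMIT 1"
    )
    return conn.execute(sql, [f"%{kw}%" for kw in keywords]).fetchone()


def make_demo_db():
    """Toy in-memory data; artist names and scores are made up."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE songs (artist_name TEXT, artist_hotttnesss REAL, title TEXT)")
    conn.executemany(
        "INSERT INTO songs VALUES (?, ?, ?)",
        [
            ("Artist A", 0.9, "My Christmas Song"),
            ("Artist A", 0.9, "Another Track"),
            ("Artist B", 0.8, "Dance Track"),
            ("Artist C", 0.5, "Quiet Track"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(most_popular_without_christmas(make_demo_db()))
```

In the toy data, Artist A is the most popular overall but is excluded because one of its titles matches a keyword, so the query falls through to the next artist.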


How long should a Christmas song be?

The honour of the longest Christmas-related song goes to Charlie Daniels for ‘A Carolina Christmas Carol’, which is an impressive 16.5 minutes of festive banter. In terms of very short songs, a significant number are by ‘The Wiggles’ from their 26-song album, ‘A Wiggly Wiggly Christmas’. Looking at the mean song duration, it appears that Christmas songs are shorter than average.


What about the lyrics?

Clearly the most important part. I will use the Natural Language Toolkit (nltk) in Python to see if we can get an idea of the lyrical content of Christmas songs. I took the lyrics from the Top 50 Christmas songs according to TimeOut magazine http://www.timeout.com/london/music/the-50-handiest-christmas-songs. For comparison, I also included lyrics from traditional Christmas carols (Traditional Christmas Carol Lyrics).

The graph below shows the similarity of these different song types by lyrical content; each datapoint represents a song, and the closer two datapoints are, the more lyrically similar the songs are judged to be (this is known as an MDS plot).

The technical details are described more fully at the end. I used a clustering algorithm that we use at Lynchpin for customer segmentation; in that case each datapoint might represent the mix of products bought by a customer, and the clustering would be based on some attribute of that customer (it could also be based on the results of performing a similar text-mining analysis on the results of a customer survey).

[Figure: Christmas song clusters (MDS plot)]

There are three distinct clusters here. Christmas pop songs (pink and yellow stars) and Christmas carols (pink stars) can be clearly distinguished by lyrical content. Interestingly, ‘The Wassailing Song’ by Blur is close to being classified as a Christmas carol (which is fitting, as it is based on a traditional Christmas song). The top words in each of these categories are as follows:

Christmas Carols: King, born, Christ, come, angels, night, elegant, singing, Lord, God, peacefully, little, Earth, like

Santa Songs: Santa, Oh, come, got, like, know, long, little, Christmas, let, just, snow, trees, night, elegant

NO Santa Songs: Christmas, time, baby, just, years, like, snow, got, let, Oh, happy, day, like, singing, trees

Generally, Christmas carols are more religious, and there are two types of Christmas pop song:

  • Santa songs: Songs that mention Santa (such as ‘Santa Claus Is Coming To Town’ and ‘Santa Claus Go Straight To The Ghetto’)
  • No Santa songs: Songs that do not mention Santa (such as ‘I Wish It Could Be Christmas Everyday’ and ‘Last Christmas’).

I am just going to generate random sequences of words from this list until it spits out something that I think sounds good. What comes out is a general indication of a recipe for success, but that recipe needs refining.

Happy Snow Time

Christmas singing years

Trees snow Christmas

happy years Oh years baby years

baby let time Christmas singing

(*Repeat without end)
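Random word sequences of this kind could be produced along the following lines. This is only a sketch: the word list is my reading of the ‘NO Santa’ top words, and the line lengths, the function name and the seed are arbitrary choices of mine rather than anything stated in the post.

```python
import random

# Assumed reading of the 'NO Santa' top words listed earlier.
NO_SANTA_WORDS = [
    "Christmas", "time", "baby", "just", "years", "like", "snow",
    "got", "let", "Oh", "happy", "day", "singing", "trees",
]


def random_lines(words, n_lines=5, min_len=3, max_len=6, seed=None):
    """Return n_lines strings, each a random run of min_len..max_len words."""
    rng = random.Random(seed)  # a seed makes a promising batch reproducible
    lines = []
    for _ in range(n_lines):
        length = rng.randint(min_len, max_len)
        lines.append(" ".join(rng.choice(words) for _ in range(length)))
    return lines


if __name__ == "__main__":
    for line in random_lines(NO_SANTA_WORDS, seed=2014):
        print(line)
```

Words are drawn with replacement, which is why a line can repeat the same word several times, much like the sequences shown above.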

So how do we make a song out of this? Well, at Lynchpin, we understand that the results of analytics must be interpreted according to your Business Needs before they make any sense. A list of random words generated by an algorithm doesn’t make a song, just like a list of numbers without any interpretation is not a business result.

So with that in mind, we proudly present:

Happy Snow Time – It’s Christmas Baby (A NO Santa Christmas Song by Tim Bush and performed by Daft Punk)

The trees are singing
Just like all the years
When they got covered with snow
On Christmas Day oh

Happy snow time
It’s Christmas baby

I just want to let you know
Before I met you
I was a bare tree
Without a covering of snow
With nothing to remark for
Baby never let me go

Happy snow time
It’s Christmas baby

Happy snow time
It’s Christmas baby

Next year

The Million Song Dataset also includes data on the musical content of songs, such as tempo and time signature. It would be interesting to carry out a Data Science project on the musical structure of Christmas songs using this data.


Technical Details of Song Lyric Clustering

Here is a general qualitative description of how I clustered songs by their lyrical content, and then generated the plot that you saw earlier. This was all done in Python:

  • Read txt files of song titles and song lyrics in as lists.
  • Use TfidfVectorizer from scikit-learn, which transforms text into feature vectors that can then be used as inputs to numerical algorithms. This step also removes words that are known as ‘stopwords’ (generally unimportant and boring words such as ‘our’ and ‘from’). Importantly, with a stemming tokenizer plugged in (TfidfVectorizer does not stem by itself; nltk provides stemmers), it also stems words, meaning that it reduces variant forms of a word to their root (e.g. loved, loving and love will all be reduced to love).
  • Run a k-means clustering algorithm on the results.
  • Generate an MDS plot using some similarity measure.
  • Colour the datapoints in the MDS plot by the cluster they were assigned to.
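The steps above can be sketched with scikit-learn as follows. This is a minimal illustration, not the original code: the toy lyrics are invented placeholders, cosine distance is my assumed choice for the unspecified similarity measure, stemming is left out to avoid an extra dependency, and parameter spellings may differ between scikit-learn versions.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import MDS
from sklearn.metrics.pairwise import cosine_distances

# Made-up stand-ins for real lyrics, one string per song.
DEMO_LYRICS = [
    "king born christ angels lord",
    "angels singing lord god earth",
    "santa snow trees night",
    "santa reindeer sleigh snow",
    "christmas baby love time",
    "happy christmas baby snow years",
]


def cluster_lyrics(lyrics, n_clusters=3, seed=0):
    """Return (cluster label per song, 2-D MDS coordinates per song)."""
    # TF-IDF feature vectors with English stopwords removed (no stemming here).
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(lyrics)
    # k-means on the TF-IDF vectors.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(tfidf)
    # Pairwise cosine distances, symmetrised to guard against rounding noise.
    dist = cosine_distances(tfidf)
    dist = (dist + dist.T) / 2
    # MDS embeds the songs in 2-D so that distances are roughly preserved.
    coords = MDS(n_components=2, dissimilarity="precomputed", random_state=seed).fit_transform(dist)
    return labels, coords


if __name__ == "__main__":
    labels, coords = cluster_lyrics(DEMO_LYRICS)
    for label, (x, y) in zip(labels, coords):
        print(label, round(float(x), 3), round(float(y), 3))
```

Colouring the plot is then a matter of passing `labels` as the colour argument when scattering `coords[:, 0]` against `coords[:, 1]`.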

Technical Details of The Analysis of Christmas Songs Over Time

This was all done using the sqlite3 library inside Python. I basically wrote SQL queries to pull out Christmas songs, put the results in a pandas dataframe, and then plotted the results using matplotlib.
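A minimal sketch of that workflow is shown below, stopping just short of the plot. The `songs` table, its `title` and `year` columns, the treatment of `year = 0` as "unknown" and the toy rows are all assumptions for illustration.

```python
import re
import sqlite3

import pandas as pd

KEYWORDS = ["Christmas", "Santa Claus", "Yuletide", "Reindeer", "Jingle Bells", "Ho Ho Ho"]


def christmas_share_by_decade(conn, keywords=KEYWORDS):
    """Return a Series indexed by decade: fraction of tracks with a Christmas-like title."""
    # Assumption: year = 0 marks a missing release year, so those rows are skipped.
    df = pd.read_sql_query("SELECT title, year FROM songs WHERE year > 0", conn)
    pattern = "|".join(re.escape(kw) for kw in keywords)
    df["is_christmas"] = df["title"].str.contains(pattern, case=False, na=False)
    df["decade"] = (df["year"] // 10) * 10
    # The mean of a boolean column is the proportion of True values.
    return df.groupby("decade")["is_christmas"].mean()


def make_demo_db():
    """Toy in-memory data; titles other than 'White Christmas' are placeholders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE songs (title TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO songs VALUES (?, ?)",
        [
            ("White Christmas", 1942),
            ("Placeholder Ballad", 1945),
            ("Placeholder Rock", 1951),
            ("Placeholder Swing", 1955),
            ("Undated Christmas Track", 0),
        ],
    )
    return conn


if __name__ == "__main__":
    print(christmas_share_by_decade(make_demo_db()))
```

Calling `.plot(kind="bar")` on the returned Series would hand the drawing over to matplotlib.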




