I just wanted to make a quick post to tell you all about the recent release of the BNC2014. The first part of this modern update to the venerable 20-year-old BNC consists entirely of spoken data, with the written data to be released next year. Here's the summary from the project website:
The Spoken BNC2014 is now accessible online in full, free of charge, for research and teaching purposes. To access the corpus, you should first create a free account on Lancaster University’s CQPweb server (https://cqpweb.lancs.ac.uk/) if you do not already have one. Once registered, please visit the BNC2014 website (http://corpora.lancs.ac.uk/bnc2014) to (a) sign the corpus’ end-user licence and (b) register your CQPweb account – following the instructions on the site. When you return to CQPweb, you will have access to the Spoken BNC2014 via the link that appears in the list of ‘Present-day English’ corpora. While access is initially only via the CQPweb platform, the underlying corpus XML files and associated metadata will be available for download in Autumn 2018. The BNC2014 website also contains lots of useful information about the corpus, and in particular a downloadable manual and reference guide.
...
The audio recordings contain face-to-face conversations between people who speak British English as their first language, collected between 2012 and 2016. The recordings could be on any subject, and speakers were aware of being recorded as they conversed. In total the corpus comprises over 10 million words.
For more information, please see: http://cass.lancs.ac.uk/?page_id=1386