Synthetic data is an upcoming technology in the data industry. There are many reasons to explore synthetic data, which I will go into shortly.
Sometimes the most effective way to collect more data is to make it yourself.
Synthetic data is manufactured by a computer program rather than measured from real-world events. There are many methods for generating it, ranging from find and replace all the way up to modern machine learning. The synthesis starts easy, but its complexity rises with the complexity of our data. To use synthetic data you need domain knowledge: you need to understand what personal data is and how features depend on one another. …
Python and R are the main tools in the world of data. Although they aren’t trivial to get started with, they should be. This post walks through my day-one setup at the UK Data Service. We cover setting up any machine for Python and R, with a plotting example for each. No programming knowledge needed.
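As a rough illustration of the find-and-replace end of that range, here is a minimal Python sketch. The record layout, the `FAKE_NAMES` pool and the `replace_names` helper are all made up for this example and are not taken from any particular tool; it also ignores dependence between features, which is where the real difficulty starts.

```python
import random

# Hypothetical pool of replacement values (an assumption for this sketch).
FAKE_NAMES = ["Alex Smith", "Sam Jones", "Charlie Brown"]


def replace_names(records, name_field="name", seed=0):
    """Return copies of the records with each real name swapped for a fake one.

    The same real name always maps to the same fake name within one call,
    so repeated individuals stay linked. Different real names may still
    collide on the same fake name because the pool is small.
    """
    rng = random.Random(seed)  # seeded so the output is repeatable
    mapping = {}               # real name -> fake name
    synthetic = []
    for record in records:
        real = record[name_field]
        if real not in mapping:
            mapping[real] = rng.choice(FAKE_NAMES)
        # Copy the record so the original data is left untouched.
        synthetic.append({**record, name_field: mapping[real]})
    return synthetic
```

Calling `replace_names([{"name": "Ada", "age": 36}])` would return a new list in which only the name has changed. Every other field passes through as-is, which is exactly why this approach alone is rarely enough to protect personal data.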
The UK Data Service hosts the United Kingdom’s largest collection of social data. Through the website, you can request, manage and deposit all sorts of social data. Sharing access to that data is only the first step. …
In Manchester, we have a Community Slack which I recommend. It is a fantastic place of joyous discussion, job offers, pet projects, pets and beyond. It’s also a great place to find a job.
One day on Slack, I got a phone call from a recruiter who had sent me a message minutes before, one I had not yet replied to.
“Sorry, I got too excited”
Although it was jarring, I gave them a chance to start again, but it came with no apology or change of tune. Poor representation is costing the company good applicants. From faking family illness to contacting managers to reach candidates, there seem to be no tricks some recruiters aren’t willing to use. The lack of self-awareness can be cringe-worthy. …
Recently some friends and I have been talking about starting a company together. We still aren’t sure what that looks like and our product changes every week. We are hitting some problems around trust, and how you make a product trusted enough to convert. I can’t remember how we got here but I drew the below diagram and labelled it “The Triangle of Success” in parody.
Since the (re)creation of “The Triangle of Success” out of a moment of boredom, I’ve noticed its power for intentional action. This is a parody of Maslow’s hierarchy of needs. For those unfamiliar, each layer is unobtainable without being secure in the one below it. …
I have been thinking about what blog post to write next. People have asked me how I bring myself to write a blog post. People have asked me how to get started once I pick a topic. So here it is, the meta “How to write a blog post” blog post.
First of all, you need to pick a topic you can discuss at length, something you could talk about for five minutes. This is the hard part. I am confident you can already write about something you know. My process is as follows.
You need a text editor. Open it up and write out a quick bullet point list of the structure of your post. I would suggest enough to prompt you if you lose flow. Here is the one I used for this…
A kata is a take-home task provided by a company as part of the interview process.
In all sorts of roles, this is a common part of the process. In Front-End it may be building a website. In Data Science it may be something like “find three insights”. In Marketing it may be “come up with a marketing strategy”, one they could steal and run away with. Whatever the kata is, it’s always free work and an expectation of best work in a small period of time. Not only is this disrespectful to the candidate’s time, but it also costs a company some of the best applicants. …
Put the work in and Twitter will work for you.
When we own a product it’s beneficial to build trust with a community. Users often check Twitter accounts to gauge the maturity of a product. It can appear jarring to follow too many people, not tweet enough or not have a nice profile picture.
Twitter is a great platform if you want somewhere where you can repeat yourself.
At a low level, you need some followers to appear valid. The number of followers you have isn’t a useful goal. The goal isn’t to hit 1000 followers or gain 25 followers this week. …
This is a sad story of how my first job in web development treated me, and what I learnt from it. Self-improvement and the mastery of skills trump anything a company can offer you.
The year is 2015. I am 20 years old and I love front end development. I spend my time freelancing around my degree. A break from academia sounds like a good idea so I start looking for industrial experience. …
Data Science roles pick up unrealistic expectations from blog posts and other job descriptions. If you are going to copy one, I hope you pick this one!
You are probably not an expert in Data Science. That’s okay, the fact that you are reading this puts you above other companies in the search for a Data Scientist.
What is a Data Scientist?
Data Science is an ocean as deep as it is wide. A Data Scientist is somebody who can:
Telephone interviews are a way to assess a candidate early on in the interview process. Candidates pass telephone interviews often, with few filtered out by the process.
Bad interviews are often given a free pass as interviewers know they are stressful. So why do we still do them? I am not against telephone interviews. A telephone interview assesses culture fit without a large time cost.
I am currently contracted to write up and carry out a Data Science interview process. As such, the following lays out my logic and recommendations for telephone interviews.
The incessant need for an audio or video call is stressful for candidates. A candidate should never feel they are being judged on anything other than aptitude for the job. As such I recommend a move to a text-based “telephone interview”. Inviting a candidate to a Slack or Skype chat gives more flexibility to the process. If a candidate, or interviewer, has to attend to something at work or home, then the interview no longer has to end. This adds accountability to the process. …