HYDROSAVER
22 Jan - 19 Feb
The Newcrest Crowd
Welcome to the first online competition of The Newcrest Crowd.
This one-month global online competition invites data scientists and innovators from around the world to develop a prediction model for tailings density (and therefore water consumption) in Newcrest's gold processing operations.
The Newcrest Crowd is a new and ongoing program to work with the best innovators worldwide and identify, develop and implement new solutions for real-world, multi-million-dollar applications. This challenge is the first in a series that will open real opportunities in one of the world's largest industries.
The global resources sector faces close to $2 trillion of impact in the next 10 years from new technologies. This impact will be driven by individuals skilled in data science, software development, and hardware engineering.
People with curiosity and drive,
people who love a challenge,
people like you!
Newcrest is one of the world’s largest gold mining companies and operates mines in four countries.
We focus on long-term value creation with an emphasis on three key value drivers: maintaining low costs, growing reserves and production, and using capital efficiently.
Newcrest's mission is to deliver superior returns from finding, developing and operating gold/copper mines. Our vision is to be the Miner of choice™. We will lead the way in safe, responsible, efficient and profitable mining.
Who can join?
Anyone is invited to participate in this competition. Whether you work alone or in a team, and whether you are a student, professional, startup or established business, everyone has the same chance.
You need motivation, determination and a willingness to explore a new industry, as well as in-depth data science knowledge.
22nd December 2017 - Competition announced
22nd January 2018, 2:00 pm - Competition starts and submissions open
19th February 2018, 11:59 pm - Submissions close
26th February 2018, 2:00 pm - Winners announced
All dates and times are in Australian Eastern Standard Time (AEST, UTC+10). Please remember this when making your final submission.
The goal is to predict the % solids of underflow 3 hours from now. This value is in the column labelled "target". The competition is scored using the Root Mean Squared Error (RMSE) metric.
The winner will be the entry that achieves the lowest RMSE on the private test set and has not violated any competition rules.
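For readers who want to score their own validation splits the same way, here is a minimal sketch of RMSE using only the Python standard library. The function name `rmse` and the toy values are illustrative assumptions, not the organisers' scoring code. The sketch also skips rows whose true target is missing, since the submission rules state that such rows do not contribute to the score.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error over rows whose true value is present."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Keep only rows with a known target; missing targets are not scored.
    squared_errors = [
        (t - p) ** 2
        for t, p in zip(y_true, y_pred)
        if t is not None and not math.isnan(t)
    ]
    if not squared_errors:
        raise ValueError("no rows with a known target to score")
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Toy example (made-up numbers): the third row has no true target,
# so its prediction is ignored when scoring.
truth = [60.0, 62.0, float("nan"), 58.0]
preds = [61.0, 60.0, 55.0, 58.0]
score = rmse(truth, preds)
print(round(score, 4))
```

Lower is better: a perfect prediction on every scored row gives an RMSE of 0.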
A sample submission is provided in sample_submission.csv. The CSV must have 439140 rows plus a header row labelled target, and each row must contain a numeric value. Note that some values of the target variable in the test data set are missing. These rows will not contribute to the RMSE, but you still need to predict a numeric value for these timesteps.
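A quick local check before uploading can catch format mistakes. The sketch below is one possible validator, assuming the file has a single column whose header is exactly `target`; the provided sample_submission.csv remains the authority on the exact layout. The function name `validate_submission` is hypothetical.

```python
import csv
import io
import math

EXPECTED_ROWS = 439140  # row count stated in the competition rules

def validate_submission(csv_text, expected_rows=EXPECTED_ROWS):
    """Return a list of problems found; an empty list means no issue was detected."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header != ["target"]:  # assumption: one column named "target"
        problems.append(f"header should be ['target'], got {header}")
    count = 0
    for line_no, row in enumerate(reader, start=2):
        count += 1
        try:
            value = float(row[0]) if len(row) == 1 else math.nan
        except ValueError:
            value = math.nan
        # Reject blanks, text, NaN and infinities: every row needs a number.
        if not math.isfinite(value):
            problems.append(f"line {line_no}: not a single numeric value: {row}")
    if count != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {count}")
    return problems

# Tiny illustrative inputs; a real file would use the default row count.
good = "target\n" + "\n".join(["60.5"] * 3) + "\n"
bad = "target\n60.5\n\nabc\n"
print(validate_submission(good, expected_rows=3))
print(validate_submission(bad, expected_rows=3))
```

To check a real file, read it with `open(path, newline="").read()` and pass the text in with the default `expected_rows`.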
Extra Rule - respect causality:
To be eligible for a prize, your solution must not require knowledge of the values in future rows. This means that your solution should not rely on future processing plant observations when predicting the target variable at the current timestep. An unshared holdout dataset will be used to determine whether the winning solution violates causality. In this final round, your submitted code will be run multiple times, and will only have access to historic data after its prediction for that time step has already been recorded. If your model performs more than 10% worse in this setting, it will be disqualified.
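One way to test a pipeline for this kind of leakage before submitting is sketched below. It is not the organisers' evaluation harness: the `sensor` field, the predictors and the helper names are all hypothetical. The idea is to reveal rows one at a time, and to confirm that predictions for early rows do not change when later rows are altered.

```python
def walk_forward(rows, predict):
    """Call predict(history, current) once per row, revealing rows in time order."""
    history = []
    predictions = []
    for current in rows:
        # Pass a copy so the predictor only ever sees past rows.
        predictions.append(predict(list(history), current))
        history.append(current)  # released only after the prediction is recorded
    return predictions

def rolling_mean_predictor(history, current, window=3):
    """Hypothetical baseline: mean of the current and most recent past readings."""
    recent = [r["sensor"] for r in (history + [current])[-window:]]
    return sum(recent) / len(recent)

def causal_batch(rows):
    """Causal by construction: each prediction uses past and current rows only."""
    return walk_forward(rows, rolling_mean_predictor)

def centered_mean_batch(rows):
    """Deliberately leaky: averages the previous, current and NEXT reading."""
    values = [r["sensor"] for r in rows]
    out = []
    for i in range(len(values)):
        neighbours = values[max(0, i - 1): i + 2]
        out.append(sum(neighbours) / len(neighbours))
    return out

def is_causal(predict_all, rows, cut):
    """Predictions for rows[:cut] must not change when rows[cut:] are altered."""
    altered = rows[:cut] + [{"sensor": r["sensor"] + 1000.0} for r in rows[cut:]]
    return predict_all(rows)[:cut] == predict_all(altered)[:cut]

# Made-up readings for illustration only.
rows = [{"sensor": float(v)} for v in [60, 61, 59, 62, 63, 60]]
print(is_causal(causal_batch, rows, cut=3))
print(is_causal(centered_mean_batch, rows, cut=3))
```

The same perturbation check can be applied to feature engineering steps such as centred rolling windows, backward fills or normalisation statistics computed over the whole data set, all of which can leak future information without being obvious.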
Each team can submit a one-page slide with the top process insights it found during the challenge. The team at Newcrest will then determine the top submission and award the prize.