Week 2
May 30th - June 3rd
This week, more data needed to be organized for the machine learning process: a different source of raw data had to be organized and uploaded to the SQL server. I spent the whole week on this, building meaningful database tables and formatting and converting the data from our sources to fit our needs. I then ran queries to load the useful information into the database on the server. The amount of information was greater than last week's, given the bulk of raw data from this source and the consequent increase in the number of tables in our database. Because the files differed in format and content, extracting the relevant information was time-consuming and often required revising the information after it had been uploaded. Information was therefore repeatedly added and deleted to make the data from last week's source and this week's consistent. By the end of the week, we had more information on the events that occurred during each game of the four seasons considered. We also had more information about each team and each player involved in each game, and a more detailed analysis of their contribution in each of their games. However, we agreed to break our current tables into more tables to allow more in-depth insight, which will ease sorting the information and other relevant tasks that we foresee for the research. This will be done at the beginning of next week.
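As an illustration of the kind of conversion-and-load step described in this entry, here is a minimal sketch. It is not the actual code or schema used: the table name `team_game_stats`, the column names, the sample rows and the two source layouts are all hypothetical, and Python's built-in `sqlite3` module stands in for the SQL server. It shows two differently formatted sources being converted to one common layout, and a load that can be rerun when uploaded information has to be revised.

```python
import sqlite3
from datetime import datetime

# Hypothetical raw rows: the two sources use different field names,
# types and date formats (all values here are made up for illustration).
SOURCE_A = [{"game_id": "1", "team": "Hawks", "date": "2015-10-27", "points": "94"}]
SOURCE_B = [{"GameID": 2, "Team": " hawks ", "Date": "10/29/2015", "PTS": 112}]


def normalize_a(row):
    """Convert a row from the first source to (game_id, team, date, points)."""
    return (int(row["game_id"]), row["team"].strip().title(),
            row["date"], int(row["points"]))


def normalize_b(row):
    """Convert a row from the second source to the same layout."""
    iso_date = datetime.strptime(row["Date"], "%m/%d/%Y").date().isoformat()
    return (int(row["GameID"]), row["Team"].strip().title(),
            iso_date, int(row["PTS"]))


def load(conn, rows):
    """Create the table if needed and insert rows, replacing revised ones."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS team_game_stats ("
        " game_id INTEGER, team TEXT, game_date TEXT, points INTEGER,"
        " PRIMARY KEY (game_id, team))"
    )
    # INSERT OR REPLACE lets a corrected row overwrite an earlier upload.
    conn.executemany(
        "INSERT OR REPLACE INTO team_game_stats VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()


def build_database():
    """Normalize both sources and load them into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    rows = [normalize_a(r) for r in SOURCE_A] + [normalize_b(r) for r in SOURCE_B]
    load(conn, rows)
    return conn
```

Keying the table on the game and team means a revised upload replaces the earlier row instead of duplicating it, which is one way to avoid deleting and re-adding information by hand.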
Getting useful and accurate information and formatting it properly is vital for the research, as it forms the basis of all results obtained thereafter. The information drawn from this data will serve as the experience from which the machine will later learn to generate accurate predictions of game outcomes.