Machine learning involves analyzing large datasets to find trends or correlations, and using them to characterize new observations and, in some cases, to perform tasks.
Using machine learning in research is like using other new tools in the practice of science.
In this article, I outline the machine learning laboratory protocol used in medical imaging diagnosis.
Introduction:
This section is where you define what you are working on: the available statistics, the previous or traditional methods used for this kind of project, and the kinds of data usually associated with it.
It also covers how machine learning comes into play, the interested parties, any regulatory bodies with a stake in the project, and any institutions partnering on it.
Data preparation:
This section covers the source of the data, where it is stored, the format it is stored in, and any extra precautions or steps taken to retrieve it into a workable format. Data accessibility is also discussed in this section.
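As an illustration of documenting the stored formats, here is a minimal sketch that tallies files by extension. The function name `inventory_by_format` and the example file names are hypothetical; a real project would pass in its own file listing.

```python
from collections import Counter
from pathlib import PurePath


def inventory_by_format(file_names):
    """Count files by lowercase extension, e.g. '.dcm' or '.csv'."""
    counts = Counter()
    for name in file_names:
        suffix = PurePath(name).suffix.lower()
        # Files without an extension are grouped under "(none)".
        counts[suffix or "(none)"] += 1
    return dict(counts)


# Hypothetical file listing, for illustration only.
example_files = ["scans/p001.dcm", "scans/p002.DCM", "labels/labels.csv", "README"]
print(inventory_by_format(example_files))
```

A table like this, reported next to the storage location and access rules, tells the reader at a glance which formats need conversion before modeling.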
Dataset analysis:
This section describes what the dataset contains, such as the number of samples and demographic information: basically, anything a reader needs to know about the dataset. It also shows whether any data is missing or imbalanced, what was done to fill the gaps or balance the classes, and what the final dataset looks like.
In summary, this section presents the analysis of the data and any feature engineering, extraction, or transformation steps.
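The checks described above can be sketched in a few lines. This is a minimal example under stated assumptions: the records, field names, and labels are made up, missing values are represented as `None`, and inverse-frequency class weighting is just one common way to handle imbalance, not necessarily the one a given project uses.

```python
from collections import Counter


def summarize_dataset(records, label_key):
    """Return the sample count, per-class counts, and per-field missing counts."""
    class_counts = Counter(r[label_key] for r in records)
    missing = Counter()
    for r in records:
        for field, value in r.items():
            if value is None:  # assumption: missing values are stored as None
                missing[field] += 1
    return {
        "n_samples": len(records),
        "class_counts": dict(class_counts),
        "missing": dict(missing),
    }


def inverse_frequency_weights(class_counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * count) for label, count in class_counts.items()}


# Hypothetical records, for illustration only.
records = [
    {"age": 54, "sex": "F", "label": "benign"},
    {"age": None, "sex": "M", "label": "benign"},
    {"age": 61, "sex": "F", "label": "benign"},
    {"age": 47, "sex": None, "label": "malignant"},
]
summary = summarize_dataset(records, "label")
weights = inverse_frequency_weights(summary["class_counts"])
print(summary)
print(weights)
```

Here the rarer class receives the larger weight, so a training procedure that accepts per-class weights pays more attention to it.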
Modeling:
This section is where you describe the environment used to train the model and the algorithms used to build it. If multiple algorithms were used, list them here along with their results and a comparison.
If there was overfitting or underfitting, discuss it: what likely caused it and what steps were taken to mitigate it. Also report which optimizations or hyperparameter tuning were done and how the final results compare with the earlier ones.
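One rough way to report overfitting and underfitting is to compare training and validation scores for each run. The sketch below is a heuristic only: the function `diagnose_fit`, its thresholds, and the run names and scores are all hypothetical placeholders, not results from a real project.

```python
def diagnose_fit(train_score, val_score, min_score=0.70, max_gap=0.05):
    """Label a (train, validation) score pair with a rough fit diagnosis.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if train_score < min_score:
        return "underfitting"  # the model does poorly even on data it has seen
    if train_score - val_score > max_gap:
        return "overfitting"  # large drop from training to validation
    return "ok"


# Hypothetical (train, validation) accuracies for three runs.
runs = {
    "baseline": (0.99, 0.81),
    "with_regularization": (0.93, 0.90),
    "too_small_model": (0.62, 0.60),
}
for name, (train, val) in runs.items():
    print(name, diagnose_fit(train, val))
```

Reporting the same two numbers before and after each tuning step makes the comparison with earlier results easy to follow.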
You can also visualize your predictions, both correct and incorrect, in different formats, including confusion matrices and graphs.
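For a binary diagnosis task, the confusion matrix and the metrics derived from it can be computed directly. This is a minimal sketch; the labels and predictions below are invented for illustration, and `1` is assumed to mean the positive (disease present) class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def summary_metrics(c):
    """Derive accuracy, sensitivity, and specificity from the counts."""
    total = sum(c.values())
    positives = c["tp"] + c["fn"]
    negatives = c["tn"] + c["fp"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total if total else 0.0,
        "sensitivity": c["tp"] / positives if positives else 0.0,
        "specificity": c["tn"] / negatives if negatives else 0.0,
    }


# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
counts = confusion_counts(y_true, y_pred)
metrics = summary_metrics(counts)
print(counts)   # {'tp': 3, 'fp': 1, 'tn': 3, 'fn': 1}
print(metrics)  # accuracy, sensitivity, and specificity are all 0.75 here
```

In a diagnostic setting, sensitivity and specificity are usually worth reporting alongside accuracy, because missed cases and false alarms carry different costs.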
Explanation:
After the predictions, it is important to explain why they were made. You can do this by analyzing the predictions to gain insight into the features the model used. The impact of these features is then analyzed both locally, on individual samples, and globally, across the whole dataset.
Plots can be used to visualize the feature or features that contributed most to the predictions.
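One way to estimate global feature impact is permutation importance: shuffle one feature column at a time and measure how much accuracy drops. This technique is chosen here for illustration and is not named in the protocol above; the toy model and data are hypothetical, and the model is assumed to be any callable that maps a feature row to a label.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = []
            for r, v in zip(rows, column):
                row = list(r)
                row[j] = v  # replace only feature j with a shuffled value
                shuffled.append(row)
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy model: uses feature 0 only and ignores feature 1.
def toy_model(row):
    return int(row[0] > 0.5)


rows = [[0.9, 3.0], [0.8, 1.0], [0.7, 2.0], [0.1, 3.0], [0.2, 1.0], [0.3, 2.0]]
labels = [1, 1, 1, 0, 0, 0]
importances = permutation_importance(toy_model, rows, labels)
print(importances)  # feature 1 scores 0.0 because the toy model never reads it
```

The resulting scores can be drawn as a bar chart to show which features contributed most. Per-sample (local) explanations need a different method and are not covered by this sketch.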
Takeaways:
Collaboration with domain experts: This covers any partnership or collaboration on the project, the role each collaborator played, the prediction accuracy achieved, and what the domain experts contributed to optimizing the model for higher accuracy.
If this is a personal project, describe the ground truth or threshold that was set for it. If there is a research paper, cite it and state whether you improved on its threshold or matched it.
Using the right tooling: Which tools were used in this project, and how did they help overall?
Interpretability and explainability: This is where you state the importance of the project to the sector in which it applies.
Conclusion: Which approaches were used, which improvements were made, and what the project achieved.
Conclusion:
This is a summary of the use of machine learning in laboratory-based projects and research. All recommendations, suggestions, and corrections are highly appreciated. I can be reached via LinkedIn. Thank you for reading.