Vishal Bakshi
Friday, August 28, 2020
I will start Chapter 2 by filling out the questionnaire, before reading the text or watching the lesson video.
Q1. Provide an example of where the bear classification model might work poorly in production, due to structural or style differences in the training data.
Q2. Where do text models currently have a major deficiency?
Q3. What are possible negative societal implications of text generation models?
Q4. In situations where a model might make mistakes, and those mistakes could be harmful, what is a good alternative to automating a process?
Q5. What kind of tabular data is deep learning particularly good at?
Q6. What's a key downside of directly using a deep learning model for recommendation systems?
Q7. What are the steps of the Drivetrain Approach?
Q8. How do the steps of the Drivetrain Approach map to a recommendation system?
Q9. Create an image recognition model using data you curate, and deploy it on the web.
Q10. What is DataLoaders?
A class that holds DataLoader objects: one with training data and one with validation data.
from fastai.vision.all import *
Q11. What four things do we need to tell fastai to create DataLoaders?
The path to the data, the filenames fnames, how to get labels using label_func, and the batch size bs.
doc(ImageDataLoaders.from_name_func)
doc(DataBlock)
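To make the label_func piece of Q11 concrete, here is a minimal sketch of a labelling function. It assumes a hypothetical naming scheme in which the label is the part of the filename before the last underscore (for example grizzly_001.jpg); the filenames are made up for illustration and are not taken from the book's dataset.

```python
from pathlib import Path

def label_func(fname):
    """Return the label encoded in a filename such as 'grizzly_001.jpg'."""
    # Assumption: the label is everything before the last underscore in the stem.
    return Path(fname).stem.rsplit("_", 1)[0]

# Hypothetical filenames, for illustration only.
fnames = ["grizzly_001.jpg", "black_014.jpg", "teddy_102.jpg"]
labels = [label_func(f) for f in fnames]
print(labels)

# With fastai installed and real images on disk, the four pieces would be
# passed together, roughly like this (not executed here):
# dls = ImageDataLoaders.from_name_func(path, fnames, label_func, bs=64)
```

Keeping the labelling logic in a plain function like this means it can be checked on a few filenames before any data is loaded.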
Q12. What does the splitter parameter to DataBlock do?
splitter seems to be a function that takes the full dataset and splits it into subsets, which are then wrapped into a Datasets object and returned on a DataLoaders.datasets call.
Q13. How do we ensure a random split always gives the same validation set?
np.random.seed(int)
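As a rough sketch of how a seed makes a random split repeatable, the function below imitates a splitter with the standard library only. It is not fastai's implementation; random_splitter and the item names are hypothetical. fastai's own RandomSplitter takes valid_pct and seed arguments for the same purpose.

```python
import random

def random_splitter(valid_pct=0.2, seed=None):
    """Build a function that splits items into (train_idxs, valid_idxs)."""
    def _split(items):
        # A fresh generator per call, so a fixed seed gives the same shuffle.
        rng = random.Random(seed)
        idxs = list(range(len(items)))
        rng.shuffle(idxs)
        cut = int(valid_pct * len(items))
        # The first `cut` shuffled indices form the validation set.
        return idxs[cut:], idxs[:cut]
    return _split

# Hypothetical items, for illustration only.
items = [f"img_{i}.jpg" for i in range(10)]
split = random_splitter(valid_pct=0.2, seed=42)
train_a, valid_a = split(items)
train_b, valid_b = split(items)
print(valid_a == valid_b)  # the two validation sets are compared
```

With seed=None the two calls would usually disagree, which is exactly the problem the seed is meant to avoid.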
Q14. What letters are often used to signify the independent and dependent variables?
x (independent)
y (dependent)
Q15. What's the difference between the crop, pad, and squish resize approaches? When might you choose one over the others?
Crop: a region of the image matching size is chosen for the training.
Pad: the image is resized to fit within size, then the gap between the original image and the area of size is filled with a reflection of the image (by default); other options include padding with zeros, which leaves a black border around the image.
Squish: the image is stretched or squeezed to size, which changes the proportions of the image.
Q16. What is data augmentation? Why is it needed?
Q17. What is the difference between item_tfms and batch_tfms?
item_tfms: a transform applied to a single image
batch_tfms: a transform applied to a batch
Q18. What is a confusion matrix?
Q19. What does export save?
export saves a .pkl file of the model
Q20. What is it called when we use a model for getting predictions, instead of training?
Q21. What are IPython widgets?
Q22. When might you want to use CPU for deployment? When might GPU be better?
Q23. What are the downsides of deploying your app to a server, instead of to a client (or edge) device such as a phone or PC?
Q24. What are three examples of problems that could occur when rolling out a bear warning system in practice?
Q25. What is "out-of-domain data"?
Q26. What is "domain shift"?
Q27. What are the three steps in the deployment process?