Machine Learning Algorithms Problem Types

Types of problems we can solve with machine learning:

  • Regression – helps establish a relationship between one or more input variables and a continuous numeric outcome

    • Algorithms
      • Simple linear regression
      • Multiple Linear Regression
      • Polynomial Regression
      • Support Vector Regression (SVR)
      • Decision Tree
      • Random Forest Regression
    • Sample problem: estimate how long it will take me to get to work based on the route I take and the day of the week
  • Classification – helps us assign an observation to a category (often a yes/no question) based on one or more input variables

    • Algorithms
      • K Nearest Neighbors (KNN)
      • Kernel SVM
      • Logistic Regression
      • Naïve Bayes
      • Decision Tree
      • Random Forest Classification
    • Sample problem: will I be late or on time based on the route I take and the day of the week
  • Clustering – helps us discover clusters of data

    • Algorithms
      • Hierarchical Clustering
      • K Means
    • Sample problem: group customers into segments based on their income and spending
  • Association – helps determine an association among multiple events

    • Algorithms
      • Apriori
      • Eclat
    • Sample problem: if I like movie A, which other movies am I likely to enjoy
  • Reinforcement Learning – helps balance exploration (trying different options) against exploitation (using the option that has performed best so far)

    • Algorithms
      • Thompson Sampling
      • Upper Confidence Bound (UCB)
    • Sample problem: we want to determine the most effective treatment. Instead of conducting a long-term randomized trial, use UCB or Thompson Sampling to determine the best treatment in a shorter interval
  • Natural Language Processing

    • Algorithms
      • Any classification algorithm; the most popular are Naïve Bayes and Random Forest
    • Sample problem: determine if an Amazon review is positive or negative
  • Deep Learning – can help determine hard-to-establish non-linear relationships between multiple input parameters and an expected outcome

    • Algorithms
      • Artificial Neural Networks (ANN)
      • Convolutional Neural Networks (CNN) – especially helpful when processing images
    • Sample problem: based on the credit score, age, balance, salary, tenure… determine if a customer is likely to continue using your service or leave
Categories: Data Science

Checking/Cleaning Disk Space on Linux

Check which directories use the most disk space (ncdu may need to be installed first, e.g. with sudo apt-get install ncdu):

sudo ncdu /
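
If ncdu is not available, a rough equivalent can be sketched with the standard df and du tools. The function names fs_usage and largest_dirs below are hypothetical helpers for illustration, not part of any package:

```shell
# Summarize used and free space on the filesystem that holds a path
# (default: /).
fs_usage() {
    df -h "${1:-/}"
}

# List the largest entries directly under a directory, biggest first.
#   $1 = directory to inspect
#   $2 = number of lines to show (default: 10)
# -x stays on one filesystem, -k reports KiB so sort -n can compare sizes.
# du exits non-zero when it cannot read some directories; that is ignored
# here so partial results are still shown.
largest_dirs() {
    { du -xk -d 1 "$1" 2>/dev/null || true; } | sort -rn | head -n "${2:-10}"
}
```

For example, fs_usage / followed by sudo largest_dirs /var 5 would show overall usage and then the five largest entries under /var; the first line of the du listing is the total for the directory itself.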

Clean up cached package files and packages that are no longer needed:

sudo apt-get clean
sudo apt-get autoclean
sudo apt-get autoremove

clean: clean clears out the local repository of retrieved package files. It removes everything but the lock file from /var/cache/apt/archives/ and /var/cache/apt/archives/partial/. When APT is used as a dselect(1) method, clean is run automatically. Those who do not use dselect will likely want to run apt-get clean from time to time to free up disk space.

autoclean: Like clean, autoclean clears out the local repository of retrieved package files. The difference is that it only removes package files that can no longer be downloaded, and are largely useless. This allows a cache to be maintained over a long period without it growing out of control. The configuration option APT::Clean-Installed will prevent installed packages from being erased if it is set to off.

autoremove: removes packages that were automatically installed to satisfy dependencies for some package and that are no longer needed.
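
To see how much these commands would free before running them, one option is to measure the package cache and do a dry run first. This is a minimal sketch that assumes the default cache location /var/cache/apt/archives mentioned above; apt_cache_kb is a hypothetical helper name, and -s is apt-get's simulate option, which prints the planned actions without changing the system:

```shell
# Print the size in KiB of a directory tree
# (default: the APT package cache).
apt_cache_kb() {
    du -sk "${1:-/var/cache/apt/archives}" 2>/dev/null | cut -f1
}

# Typical use on a Debian/Ubuntu system (left commented out so that
# pasting this block changes nothing):
#   before=$(apt_cache_kb)
#   sudo apt-get -s autoremove    # dry run: lists packages that would be removed
#   sudo apt-get clean
#   after=$(apt_cache_kb)
#   echo "Freed $((before - after)) KiB from the APT cache"
```

Reading parts of the cache directory may require root, so the reported figure can be lower than the real total when the helper runs as a normal user.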

See a related question on askubuntu: https://askubuntu.com/questions/3167/what-is-difference-between-the-options-autoclean-autoremove-and-clean

Categories: Linux