Learn Python by Building Data Science Applications

Book description

Understand the constructs of the Python programming language and use them to build data science projects

Python is the most widely used programming language for building data science applications. Complete with step-by-step instructions, this book contains easy-to-follow tutorials to help you learn Python and develop real-world data science projects. The "secret sauce" of the book is its curated list of topics and solutions, put together using a range of real-world projects, covering initial data collection, data analysis, and production.

This Python book starts by taking you through the basics of programming, from variables and data types to classes and functions. You'll learn how to write idiomatic code, test and debug it, and discover how you can create packages or use the range of built-in ones. You'll also be introduced to the extensive ecosystem of Python data science packages, including NumPy, pandas, scikit-learn, Altair, and Datashader. Furthermore, you'll be able to perform data analysis, train models, and interpret and communicate the results. Finally, you'll get to grips with structuring and scheduling scripts using Luigi and sharing your machine learning models with the world as a microservice.

By the end of the book, you'll have learned not only how to implement Python in data science projects, but also how to maintain and design them to meet high programming standards.

Who this book is for

If you want to learn Python or data science in a fun and engaging way, this book is for you. You'll also find this book useful if you're a high school student, researcher, analyst, or anyone with little or no coding experience who has an interest in the subject and the courage to learn, fail, and learn from failing. A basic understanding of how computers work will be useful.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Learn Python by Building Data Science Applications
    1. Why subscribe?
    1. About the authors
    2. About the reviewers
    3. Packt is searching for authors like you
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Code in Action
      4. Conventions used
    1. Reviews
    1. Technical requirements
    2. Installing Python
    3. Downloading materials for running the code
      1. Installing Python packages
    1. The VS Code interface
    1. Notebooks
    2. The Jupyter interface
    1. Technical requirements
    2. Assigning variables
    3. Naming the variable
    4. Understanding data types
      1. Floats and integers
        1. Operations with self-assignment
        2. Order of execution
    1. Formatting
      1. Format method
      2. F-strings
      3. Legacy formatting
      4. Formatting mini-language
    1. Logical operators
    1. Technical requirements
    2. Understanding a function
      1. Interface functions
        1. The input function
        2. The eval function
    1. The help function
    2. The type function
    3. The isinstance function
    4. dir
    1. abs
    2. The round function
    1. The len function
    2. The sorted function
    3. The range function
    4. The all and any functions
    5. The max, min, and sum functions
    1. Default values
    2. Var-positional and var-keyword
    3. Docstrings
    4. Type annotations
    1. Technical requirements
    2. What are data structures?
      1. Lists
      2. Slicing
      3. Tuples
      4. Immutability
      5. Dictionaries
      6. Sets
    1. frozenset
    2. defaultdict
    3. Counter
    4. Queue
    5. deque
    6. namedtuple
    7. Enumerations
    1. The sum, max, and min functions
    2. The all and any functions
    3. The zip function
    4. The map, filter, and reduce functions
    1. Technical requirements
    2. Understanding if, else, and elif statements
      1. Inline if statements
      2. Using if in a comprehension
    1. The for loop
    2. itertools
      1. cycle
      2. chain
      3. product
    1. Exceptions
    2. try/except
    3. try/except/finally
    1. Technical requirements
    2. Geocoding as a service
    3. Learning about web APIs
      1. Working with HTTPS
    1. The requests library
    2. Starting to code
    1. Geocoding the addresses
    1. Technical requirements
    2. When there is no API
      1. HTML in a nutshell
      2. Scraping with Beautiful Soup 4
      3. CSS and XPath selectors
        1. Developer console
    1. Step 1 – Scraping the list of battles
      1. Unordered list
    1. Key information
    2. Additional information
    1. Technical requirements
    2. Understanding classes
      1. Special (dunder) methods
        1. __init__
        2. __repr__ and __str__
        3. Arithmetical and logical operations
        4. Equality/relationship methods
        5. __len__
        6. __getitem__
        7. __class__
    1. Writing the base classes
    2. Writing the Island class
    3. Herbivore haven
    4. Harsh islands
    5. Visualization
    1. Technical requirements
    2. Shell
      1. Pipes
      2. Executing Python scripts
      3. Command-line interface
    1. Concept
    2. GitHub
    3. Practical example
    4. gitignore
    1. Conda for virtual environments
    2. Conda and Jupyter
    1. Technical requirements
    2. Introducing Python for data science
    3. Exploring NumPy
    4. Beginning with pandas
    5. Trying SciPy and scikit-learn
    6. Understanding Jupyter
    7. Summary
    8. Questions
    1. Technical requirements
    2. Getting started with pandas
      1. Selection – by columns, indices, or both
      2. Masking
      3. Data types and data conversion
      4. Math
      5. Merging
    1. Initial exploration
    2. Defining the scope of work to be done
    1. Geocoding
    1. Multilevel slicing
    1. Technical requirements
    2. Exploring the dataset
      1. Descriptive statistics
      2. Data visualization with matplotlib (and its pandas interface)
      3. Aggregating the data to calculate summary statistics
        1. Resampling
    1. Drawing maps with Altair
    2. Storing the Altair chart
    1. Technical requirements
    2. Understanding the basics of ML
      1. Exploring unsupervised learning
      2. Moving on to supervised learning
        1. k-nearest neighbors
        2. Linear regression
        3. Decision trees
    1. Technical requirements
    2. Understanding cross-validation
    3. Exploring feature engineering
      1. Failed attempts
    1. Using a random forest model
    1. Starting with data
    2. Adding code to the equation
    3. Metrics
    1. Technical requirements
    2. Building a package
      1. Bringing your own package
      2. Using a package manager – pip and conda
      3. Creating a package scaffolding
    1. Trying out code with Poetry
    2. Adding actual code
    3. Defining dependencies
    4. Non-code resources
    5. Publishing the package
    6. Development workflow
    1. Testing with PyTest
    2. Writing our own tests
    1. Technical requirements
    2. Introducing the ETL pipeline
      1. Redesigning your code as a pipeline
    1. Connecting the dots
    1. Scheduling with cron
    1. Writing to an S3 bucket
    2. Writing to SQL
    1. Technical requirements
    2. Building a dashboard – three types of dashboard
      1. Static dashboards
      2. Debugging Altair
      3. Connecting your app to the Luigi pipeline
    1. First try with panel
    2. Reading data from the database
    3. Creating an interactive dashboard in Jupyter
    1. Technical requirements
    2. What is a RESTful API?
      1. Python web frameworks
    1. Exploring service with OpenAPI
    2. Finalizing our naive first iteration
    3. Data validation
    4. Sending data in with POST requests
    5. Adding features to our service
    1. Technical requirements
    2. Understanding serverless
    3. Getting started with Chalice
    4. Setting up a simple model
      1. Externalizing medians
    1. When we're still out of memory
    1. S3-triggered events
    1. Technical requirements
    2. Speeding up your Python code
      1. Rewriting the code with NumPy
      2. Specialized data structures and algorithms
      3. Dask
        1. Dask-ML
    1. Different types of concurrency
    2. Two types of problems
    3. Before you start rewriting your code
    1. Code formatting with black
    2. Measuring code quality with Wily
    3. Writing tests with hypothesis
    1. Different Python flavors
    2. Docker containers
    3. Kubernetes
    1. Chapter 1
    2. Chapter 2
    3. Chapter 3
    4. Chapter 4
    5. Chapter 5
    6. Chapter 6
    7. Chapter 7
    8. Chapter 8
    9. Chapter 9
    10. Chapter 10
    11. Chapter 11
    12. Chapter 12
    13. Chapter 13
    14. Chapter 14
    15. Chapter 15
    16. Chapter 16
    17. Chapter 17
    18. Chapter 18
    19. Chapter 19
    20. Chapter 20
    1. Leave a review - let other readers know what you think
Product information

• Title: Learn Python by Building Data Science Applications
• Author(s): Philipp Kats, David Katz
• Release date: August 2019
• Publisher(s): Packt Publishing
• ISBN: 9781789535365