Artificial Intelligence Books: Books in English

Prompt Engineering for Generative AI

Future-Proof Inputs for Reliable AI Outputs

by James Phoenix, Mike Taylor

Target audience: Intermediate

Publisher's summary

Large language models (LLMs) and diffusion models such as ChatGPT and Stable Diffusion have unprecedented potential. Because they have been trained on all the public text and images on the internet, they can make useful contributions to a wide variety of tasks. And with the barrier to entry greatly reduced today, practically any developer can harness LLMs and diffusion models to tackle problems previously unsuitable for automation.

With this book, you'll gain a solid foundation in generative AI, including how to apply these models in practice. When first integrating LLMs and diffusion models into their workflows, most developers struggle to coax reliable enough results from them to use in automated systems. Authors James Phoenix and Mike Taylor show you how a set of principles called prompt engineering can enable you to work effectively with AI.

Learn how to empower AI to work for you. This book explains:

The structure of the interaction chain of your program's AI model and the fine-grained steps in between
How AI model requests arise from transforming the application problem into a document completion problem in the model training domain
The influence of LLM and diffusion model architecture—and how to best interact with it
How these principles apply in practice in the domains of natural language processing, text and image generation, and code

Publisher: O'Reilly - 422 pages, 1st edition, June 25, 2024

ISBN10: 109815343X - ISBN13: 9781098153434

Order on www.amazon.fr:

61.84 € incl. tax (publisher's price 61.84 € incl. tax)

Table of contents

The Five Principles of Prompting
Introduction to Large Language Models for Text Generation
Standard Practices for Text Generation with ChatGPT
Advanced Techniques for Text Generation with LangChain
Vector Databases with FAISS and Pinecone
Autonomous Agents with Memory and Tools
Introduction to Diffusion Models for Image Generation
Standard Practices for Image Generation with Midjourney
Advanced Techniques for Image Generation with Stable Diffusion
Building AI-Powered Applications

Editorial review by Thibaut Cuvelier, December 15, 2024

Prompt engineering is a field that seems to have emerged in the wake of generative technologies in order to get the most out of them. Although at first it consisted mainly of arbitrary recipes that worked on only one model, this form of textual engineering has become much more professional, as this book demonstrates. It opens with a series of principles for experimenting in a structured way, entirely independently of the underlying language model, with two objectives: achieving the desired functionality (whether text, image, or code) and doing so with high reliability.

Through many practical examples (available on GitHub), the authors mainly show how to use a library such as LangChain to produce very high-quality output with full automation. Although the field is constantly evolving, they focus on proven techniques that are very likely to remain useful for several generations of language models. Moreover, they treat no mechanism as a miracle cure, but rather as a tool with its own conditions of use, advantages, and limitations.

This book will be very useful to anyone wishing to automate processes using language models, by identifying the strengths of this approach (and how to trigger them), while also being aware of the potential problems ahead (and how to work around them).

Building LLMs for Production

Enhancing LLM Abilities and Reliability with Prompting, Fine-Tuning, and RAG

by Louis-Francois Bouchard, Louie Peters

Target audience: Beginner

Publisher's summary

"This is the most comprehensive textbook to date on building LLM applications - all essential topics in an AI Engineer's toolkit." - Jerry Liu, Co-founder and CEO of LlamaIndex

With amazing feedback from industry leaders, this book is an end-to-end resource for anyone looking to enhance their skills or dive into the world of AI and develop their understanding of Generative AI and Large Language Models (LLMs). It explores various methods to adapt "foundational" LLMs to specific use cases with enhanced accuracy, reliability, and scalability. Written by over 10 people on our Team at Towards AI and curated by experts from Activeloop, LlamaIndex, Mila, and more, it is a roadmap to the tech stack of the future.

The book aims to guide developers through creating LLM products ready for production, leveraging the potential of AI across various industries. It is tailored for readers with an intermediate knowledge of Python.

What's Inside this Book?

Hands-on Guide on LLMs, prompting, RAG, & Fine-tuning.
Roadmap for building production-ready applications using LLMs.
Fundamentals of LLM theory.
Simple-to-advanced LLM techniques & frameworks.
Code projects with real-world applications.
Colab notebooks that you can run right away.

Publisher: Towards AI - 468 pages, 1st edition, May 21, 2024

ISBN10: 8324731472 - ISBN13: 9798324731472

Order on www.amazon.fr:

52.74 € incl. tax (publisher's price 52.74 € incl. tax)

Table of contents

Introduction to LLMs
LLM Architectures and Landscape
LLMs in Practice
Introduction to Prompting
Retrieval-Augmented Generation
Introduction to LangChain & LlamaIndex
Prompting with LangChain
Indexes, Retrievers, and Data Preparation
Advanced RAG
Agents
Fine-Tuning
Deployment and Optimization

Editorial review by Thibaut Cuvelier, November 17, 2024

Among the many books on LLMs, this one stands out for its resolutely practical approach, the end goal being to put into production a complex system based on RAG or agents on cloud infrastructure. It also comes with a GitHub repository containing the code examples used, as well as a website that adds many references.

The authors detail the mechanisms needed to deploy an LLM-based application, starting with the simplest and then describing ways to improve the results if that proves necessary, using the most widely used Python libraries of the moment. They also highlight use cases, so that the reader can quickly determine where, in an existing system, LLMs could add value.

However, the book skips the more fundamental topics. The explanations of the basic principles of LLMs are brief (for example, you will not learn how to train a base model). This theoretical background is nevertheless not essential to benefit fully from the practical parts.

Hands-On Large Language Models

Language Understanding and Generation

by Jay Alammar, Maarten Grootendorst

Target audience: Beginner

Publisher's summary

AI has acquired startling new language capabilities in just the past few years. Driven by rapid advances in deep learning, language AI systems are able to write and understand text better than ever before. This trend is enabling new features, products, and entire industries. Through this book's visually educational nature, readers will learn practical tools and concepts they need to use these capabilities today.

You'll understand how to use pretrained large language models for use cases like copywriting and summarization; create semantic search systems that go beyond keyword matching; and use existing libraries and pretrained models for text classification, search, and clusterings.

This book also helps you:

Understand the architecture of Transformer language models that excel at text generation and representation
Build advanced LLM pipelines to cluster text documents and explore the topics they cover
Build semantic search engines that go beyond keyword search, using methods like dense retrieval and rerankers
Explore how generative models can be used, from prompt engineering all the way to retrieval-augmented generation
Gain a deeper understanding of how to train LLMs and optimize them for specific applications using generative model fine-tuning, contrastive fine-tuning, and in-context learning

Publisher: O'Reilly - 425 pages, 1st edition, October 15, 2024

ISBN10: 1098150961 - ISBN13: 9781098150952

Order on www.amazon.fr:

76.89 € incl. tax (publisher's price 76.89 € incl. tax)

Table of contents

Preface

Understanding Language Models

An Introduction to Large Language Models
Tokens and Embeddings
Looking Inside Large Language Models

Using Pretrained Language Models

Text Classification
Text Clustering and Topic Modeling
Prompt Engineering
Advanced Text Generation Techniques and Tools
Semantic Search and Retrieval-Augmented Generation
Multimodal Large Language Models

Training and Fine-Tuning Language Models

Creating Text Embedding Models
Fine-Tuning Representation Models for Classification
Fine-Tuning Generation Models

Afterword

Editorial review by Thibaut Cuvelier, October 11, 2024

It is said to be the next industrial revolution: everyone is talking about generative AI and large language models. This technical book demystifies the field without going into the mathematics. The authors delve into the concepts behind LLMs (the first part), but also into a series of applications, whether fashionable or of direct use in business (such as semantic search or classification). Throughout, they show Python code examples for using existing models or improving them, which are also available on GitHub.

The explanation of concepts is probably the book's strong point. The first part (three chapters) is devoted to the building blocks of current models. With many diagrams of rare clarity (printed in color), the authors explain how the models work, from the ingestion of raw text to inference, as well as their recent developments. They aim to give an intuitive understanding of these models rather than leaning on formulas. Even so, the diligent reader can use the academic citations to dig deeper into this side. One can only regret that mixture of experts (MoE), an essential component of a number of models, is ignored and that the reader is assumed to have some knowledge of classical NLP (such as TF-IDF).

The other two parts of the book deal with putting things into practice, whether through direct applications of existing models or fine-tuning of those models. The authors do not stop at text, notably covering multimodal models. These case studies address issues close to production, such as the time needed for inference depending on the use case, or the exponential backoff technique.

This book will therefore be useful to anyone curious about or interested in the applications of LLMs. You will quickly learn to go from idea to implementation of your projects while understanding what you are doing and with ideas for improving the results.

All-in On AI

How Smart Companies Win Big with Artificial Intelligence

by Thomas H. Davenport, Nitin Mittal

Target audience: Intermediate

Publisher's summary

A Wall Street Journal bestseller

A Publishers Weekly bestseller

A fascinating look at the trailblazing companies using artificial intelligence to create new competitive advantage, from the author of the business classic, Competing on Analytics, and the head of Deloitte's US AI practice.

Though most organizations are placing modest bets on artificial intelligence, there is a world-class group of companies that are going all-in on the technology and radically transforming their products, processes, strategies, customer relationships, and cultures.

Though these organizations represent less than 1 percent of large companies, they are all high performers in their industries. They have better business models, make better decisions, have better relationships with their customers, offer better products and services, and command higher prices.

Written by bestselling author Tom Davenport and Deloitte's Nitin Mittal, All-In on AI looks at artificial intelligence at its cutting edge from the viewpoint of established companies like Anthem, Ping An, Airbus, and Capital One.

Filled with insights, strategies, and best practices, All-In on AI also provides leaders and their teams with the information they need to help their own companies take AI to the next level.

If you're curious about the next phase in the implementation of artificial intelligence within companies, or if you're looking to adopt this powerful technology in a more robust way yourself, All-In on AI will give you a rare inside look at what the leading adopters are doing, while providing you with the tools to put AI at the core of everything you do.

Publisher: Harvard Business Review Press - 224 pages, 1st edition, January 24, 2023

ISBN10: 1647824699 - ISBN13: 9781647824693

Order on www.amazon.fr:

19.39 $ incl. tax (publisher's price 0.00 $ incl. tax)

No review has been written yet

Natural Language Processing in the Real World

Text Processing, Analytics, and Classification

by Jyotika Singh

Target audience: Beginner

Publisher's summary

Natural Language Processing in the Real World is a practical guide for applying data science and machine learning to build Natural Language Processing (NLP) solutions. Where traditional, academic-taught NLP is often accompanied by a data source or dataset to aid solution building, this book is situated in the real world where there may not be an existing rich dataset.

This book covers the basic concepts behind NLP and text processing and discusses the applications across 15 industry verticals. From data sources and extraction to transformation and modelling, and classic Machine Learning to Deep Learning and Transformers, several popular applications of NLP are discussed and implemented.

This book provides a hands-on and holistic guide for anyone looking to build NLP solutions, from students of Computer Science to those involved in large-scale industrial projects.

Publisher: CRC - 388 pages, 1st edition, July 3, 2023

ISBN10: 1032195339 - ISBN13: 9781032195339

Order on www.amazon.fr:

77.03 € incl. tax (publisher's price 77.03 € incl. tax)

Table of contents

NLP Concepts

NLP Basics

Data Sources and Extraction

Data Sources and Extraction

Data Processing and Modeling

Data Preprocessing and Transformation
Data Modeling

NLP Applications across Industry Verticals

NLP Applications – Active Usage
NLP Applications – Developing Usage

Implementing Advanced NLP Applications

Information Extraction and Text Transforming Models
Text Categorisation and Affinities

Implementing NLP Projects in the Real-World

Chatbots
Customer Review Analysis
Recommendations and Predictions
More Real-World Scenarios and Tips

Editorial review by Thibaut Cuvelier, January 21, 2024

The field of natural language processing has evolved enormously in recent years, culminating in statistical models such as GPT-4. However, these improvements have remained confined to industrial laboratories or a few flagship products. The aim of the book is to spread these techniques, from the most basic to the most advanced (entity recognition, translation, conversational agents, recommendations), so that they are used on a larger scale. One of its main features probably lies in a two-part structure: Part IV presents a series of domains and potential applications of language processing; Parts V and VI show how to use existing libraries to deploy these capabilities.

The author assumes that the reader is used to working with data, but does not know the mathematical intricacies underlying the usual techniques. As a result, there is very little mathematics in the whole book, at most a few formulas where there is no other way. Most of the content revolves around applying NLP to practical cases, after an introduction giving the general principles. The text is regularly imprecise (about mathematics, algorithms, or linguistics) and prefers to convey the main idea. Thus, the author does not detail most of the algorithms, preferring to show how to use them in an application.

Text Analytics

An Introduction to the Science and Applications of Unstructured Information Analysis

by John Atkinson-Abutridy

Target audience: Beginner

Publisher's summary

Text Analytics: An Introduction to the Science and Applications of Unstructured Information Analysis is a concise and accessible introduction to the science and applications of text analytics (or text mining), which enables automatic knowledge discovery from unstructured information sources, for both industrial and academic purposes. The book introduces the main concepts, models, and computational techniques that enable the reader to solve real decision-making problems arising from textual and/or documentary sources.

Features:

Easy-to-follow step-by-step concepts and methods
Every chapter is introduced in a very gentle and intuitive way so students can understand the WHYs, WHAT-IFs, WHAT-IS-THIS-FORs, HOWs, etc. by themselves
Practical programming exercises in Python for each chapter
Includes theory and practice for every chapter, summaries, practical coding exercises for target problems, QA, and sample code and data available for download at https://www.routledge.com/Atkinson-Abutridy/p/book/9781032249797

Publisher: CRC Press - 258 pages, 1st edition, April 29, 2022

ISBN10: 1032245263 - ISBN13: 9781032245263

Order on www.amazon.fr:

57.96 € incl. tax (publisher's price 57.98 € incl. tax)

Table of contents

Text Analytics
Natural-Language Processing
Information Extraction
Document Representation
Association Rules Mining
Corpus-Based Semantic Analysis
Document Clustering
Topic Modeling
Document Categorization

Editorial review by Thibaut Cuvelier, July 19, 2023

Modern text analysis relies almost exclusively on neural networks, to the point that some people neglect the basics while learning. This book focuses on more classical approaches, without necessarily using a neural network at every step (LLMs are mentioned, but not explained). The author's aim is to focus on an in-depth understanding of the concepts, expressed with a certain simplicity, without piling on algorithms. Nevertheless, the book covers a wide variety of topics, even if, given current events, one would have liked to see the generative aspect given more prominence.

The reader will need some knowledge of a programming language such as Python, since practice is a very important element of this book. Each chapter includes a section of Python exercises, some guided and others not, using the data available on the book's website.

The explanations often appeal to intuition to avoid excessive formalism or to pass on experience (why some approaches work in certain cases and not in others, what information a particular technique can extract from a text, and so on). Still, the mathematical approach is important for properly grasping the details of the algorithms (which requires some background in linear algebra or probability, in particular).

The book ends with a glossary of the most important terms (something one would like to see much more often) and an academic bibliography. Bibliographic references appear throughout the book and point to pedagogical articles or other books rather than to the papers that introduced a technique.

Practical Simulations for Machine Learning

Using Synthetic Data for AI

by Paris Buttfield-Addison, Mars Buttfield-Addison, Tim Nugent, Jon Manning

Target audience: Beginner

Publisher's summary

Simulation and synthesis are core parts of the future of AI and machine learning. Consider: programmers, data scientists, and machine learning engineers can create the brain of a self-driving car without the car. Rather than use information from the real world, you can synthesize artificial data using simulations to train traditional machine learning models. That's just the beginning.

With this practical book, you'll explore the possibilities of simulation- and synthesis-based machine learning and AI, concentrating on deep reinforcement learning and imitation learning techniques. AI and ML are increasingly data driven, and simulations are a powerful, engaging way to unlock their full potential.

You'll learn how to:

Design an approach for solving ML and AI problems using simulations with the Unity engine
Use a game engine to synthesize images for use as training data
Create simulation environments designed for training deep reinforcement learning and imitation learning models
Use and apply efficient general-purpose algorithms for simulation-based ML, such as proximal policy optimization
Train a variety of ML models using different approaches
Enable ML tools to work with industry-standard game development tools, using PyTorch, and the Unity ML-Agents and Perception Toolkits

Publisher: O'Reilly - 500 pages, 1st edition, June 21, 2022

ISBN10: 1492089923 - ISBN13: 9781492089926

Order on www.amazon.fr:

45.37 € incl. tax (publisher's price 45.37 € incl. tax)

The Basics of Simulation and Synthesis

Introducing Synthesis and Simulation
Creating Your First Simulation
Creating Your First Synthesized Data

Simulating Worlds for Fun and Profit

Creating a More Advanced Simulation
Creating a Self-Driving Car
Introducing Imitation Learning
Advanced Imitation Learning
Introducing Curriculum Learning
Cooperative Learning
Using Cameras in Simulations
Working with Python
Under the Hood and Beyond

Synthetic Data, Real Results

Creating More Advanced Synthesized Data
Synthetic Shopping

Editorial review by Thibaut Cuvelier, July 19, 2023

There is a plethora of books on artificial intelligence, but this one covers a domain that is still little explored: using software tools to generate data, either for systems meant to be deployed in the real world (synthesis) or not (simulation). More specifically, the book uses Unity and its Machine Learning Agents Toolkit library to create agents with reinforcement learning techniques.

The book assumes few prerequisites beyond some programming experience, ideally in C# with a little Python. The authors present the Unity features essential to these use cases, which leads to fairly simple scenes. As for reinforcement learning, they introduce the basic concepts, but this is clearly not a book on the subject: all the examples use the existing ml-agents algorithms, despite a chapter that opens the door for readers who want to code their own reinforcement learning technique.

Overall, this book is an ml-agents tutorial, with well-explained pieces of C# code and the underlying ideas spelled out in detail. The authors consider practice essential to absorbing the necessary concepts.

Natural Language Processing with Transformers

Building Language Applications with Hugging Face

by Lewis Tunstall, Leandro von Werra, Thomas Wolf

Target audience: Beginner

Publisher's summary

Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or coder, this practical book (now revised in full color) shows you how to train and scale these large models using Hugging Face Transformers, a Python-based deep learning library.

Transformers have been used to write realistic news stories, improve Google Search queries, and even create chatbots that tell corny jokes. In this guide, authors Lewis Tunstall, Leandro von Werra, and Thomas Wolf, among the creators of Hugging Face Transformers, use a hands-on approach to teach you how transformers work and how to integrate them in your applications. You'll quickly learn a variety of tasks they can help you solve.

Build, debug, and optimize transformer models for core NLP tasks, such as text classification, named entity recognition, and question answering
Learn how transformers can be used for cross-lingual transfer learning
Apply transformers in real-world scenarios where labeled data is scarce
Make transformer models efficient for deployment using techniques such as distillation, pruning, and quantization
Train transformers from scratch and learn how to scale to multiple GPUs and distributed environments

Publisher: O'Reilly - 406 pages, 1st edition, June 17, 2022

ISBN10: 1098136799 - ISBN13: 9781098136796

Order on www.amazon.fr:

63.24 € incl. tax (publisher's price: 63.24 € incl. tax)

Hello Transformers
Text Classification
Transformer Anatomy
Multilingual Named Entity Recognition
Text Generation
Summarization
Question Answering
Making Transformers Efficient in Production
Dealing with Few to No Labels
Training Transformers from Scratch
Future Directions

Editorial review by Thibaut Cuvelier, July 18, 2023

The news overflows with articles on the latest feats of "artificial intelligence", which very often means ChatGPT or an equivalent. This book offers an overview of the main Python library behind those headlines, Hugging Face Transformers. Although the library is not limited to text processing, that is almost always how it is used, and the book likewise considers almost exclusively text applications.

The book is primarily a tutorial for this library, organized by the applications one wants to build with it. Each chapter contains several simple, well-designed usage examples, preceded by a short introduction to the principles involved. The first uses of the library are fairly simple, relying only on pretrained models, but one of the final chapters shows how to build your own transformer-based model from scratch.

One chapter is devoted to the transformer technique, the core of the models offered by Hugging Face Transformers. It is a little short for a thorough understanding of the principle, but it gives a sufficient overview. The authors regularly link to the scientific literature for readers who want to go deeper.

The authors assume that the reader already knows Python and has some basics in deep learning, although the latter are not really needed to benefit from most of the book. Only the use of the PyTorch library is covered in detail, even though Hugging Face Transformers can also use TensorFlow (dispatched in a single section). However, no knowledge of text processing is required.

If you buy it, get the revised edition: besides a few corrections to the first edition, it is printed in color, which proves useful when reading.

Deep Learning on Graphs

by Yao Ma, Jiliang Tang

Target audience: Beginner

Publisher's summary

'This timely book covers a combination of two active research areas in AI: deep learning and graphs. It serves the pressing need for researchers, practitioners, and students to learn these concepts and algorithms, and apply them in solving real-world problems. Both authors are world-leading experts in this emerging area.' — Huan Liu, Arizona State University

'Deep learning on graphs is an emerging and important area of research. This book by Yao Ma and Jiliang Tang covers not only the foundations, but also the frontiers and applications of graph deep learning. This is a must-read for anyone considering diving into this fascinating area.' — Shuiwang Ji, Texas A&M University

Deep learning on graphs has become one of the hottest topics in machine learning. The book consists of four parts to best accommodate our readers with diverse backgrounds and purposes of reading. Part 1 introduces basic concepts of graphs and deep learning; Part 2 discusses the most established methods from the basic to advanced settings; Part 3 presents the most typical applications including natural language processing, computer vision, data mining, biochemistry and healthcare; and Part 4 describes advances of methods and applications that tend to be important and promising for future research. The book is self-contained, making it accessible to a broader range of readers including (1) senior undergraduate and graduate students; (2) practitioners and project managers who want to adopt graph neural networks into their products and platforms; and (3) researchers without a computer science background who want to use graph neural networks to advance their disciplines.

Publisher: Cambridge - 400 pages, 1st edition, December 9, 2021

ISBN10: 1108831745 - ISBN13: 9781108831741

Order on www.amazon.fr:

57.55 € incl. tax (publisher's price: 57.55 € incl. tax)

Introduction

Deep Learning on Graphs: An Introduction

Foundations

Foundations of Graphs
Foundations of Deep Learning

Methods

Graph Embedding
Graph Neural Networks
Robust Graph Neural Networks
Scalable Graph Neural Networks
Graph Neural Networks on Complex Graphs
Beyond GNNs: More Deep Models on Graphs

Applications

Graph Neural Networks in Natural Language Processing
Graph Neural Networks in Computer Vision
Graph Neural Networks in Data Mining
Graph Neural Networks in Biochemistry and Healthcare

Advances

Advanced Topics in Graph Neural Networks
Advanced Applications in Graph Neural Networks

Editorial review by Thibaut Cuvelier, December 2, 2022

Graphs were long a poor relation in machine learning, as few techniques generalize to such complicated structures. Neural networks are currently at the forefront of research in this area and are percolating into industrial applications. This book sits between the two, with detailed coverage of the theory underlying graph neural networks (GNNs) and an in-depth look at a few applications.

The authors start from almost no prior knowledge of graphs or neural networks and supply many scientific references, mostly very recent ones. They give the main theoretical results in full (complete proofs or the mathematical details of neural network architectures), which unfortunately sometimes tends to limit the book's readability.

This book is a thorough literature review of the field, with the authors covering types of graphs that have attracted more or less attention (directed or undirected, but also heterogeneous, multipartite, and hypergraphs). They seek to unify seemingly very different results under a common vocabulary.

The applications covered range from the most classic to the most recent (including combinatorial optimization). GNNs are put to every use, including data generation (GANs) and unsupervised modeling (variational autoencoders). It is regrettable, however, that the practical side is limited to describing the neural network architectures deployed, with no pointers on how to start implementing them.

Scientific Writing 3.0

A Reader and Writer's Guide

by Jean-Luc Lebrun, Justin Lebrun

Target audience: Intermediate

Publisher's summary

The third edition of this book aims to equip both young and experienced researchers with all the tools and strategy they will need for their papers to not just be accepted, but stand out in the crowded field of academic publishing. It seeks to question and deconstruct the legacy of existing science writing, replacing or supporting historically existing practices with principle- and evidence-driven styles of effective writing. It encourages a reader-centric approach to writing, satisfying reader-scientists at large, but also the paper's most powerful readers, the reviewer and editor. Going beyond the baseline of well-structured scientific writing, this book leverages an understanding of human physiological limitations (memory, attention, time) to help the author craft a document that is optimized for readability.

Through real and fictional examples, hands-on exercises, and entertaining stories, this book breaks down the critical parts of a typical scientific paper (Title, Abstract, Introduction, Visuals, Structure, and Conclusions). It shows at great depth how to achieve the essential qualities required in scientific writing, namely being clear, concise, convincing, fluid, interesting, and organized. To enable the writer to assess whether these parts are well written from a reader's perspective, the book also offers practical metrics in the form of six checklists, and even an original Java application to assist in the evaluation.

Publisher: World Scientific - 316 pages, 3rd edition, October 21, 2021

ISBN10: 9811229538 - ISBN13: 9789811229534

Order on www.amazon.fr:

37.09 € incl. tax (publisher's price: 37.09 € incl. tax)

The Reading Toolkit

Writer vs. Reader, a Matter of Attitude
Strategic Writing
The Scientific Writing Style
Require Less from Memory
Sustain Attention to Ensure Continuous Reading
Reduce Reading Time
Keep the Reader Motivated
Bridge the Knowledge Gap
Set the Reader's Expectations
Set Progression Tracks for Fluid Reading
Detect Sentence Fluidity Problems
Control Reading Energy Consumption

Paper Structure and Purpose

Title: The Face of Your Paper
Abstract: The Heart of Your Paper
Headings-Subheadings: The Skeleton of Your Paper
Introduction: The Hands of Your Paper
Introduction Part II: Popular Traps
Visuals: The Voice of Your Paper
Conclusions: The Smile of Your Paper
Additional Resources for the Avid Learner

Editorial review by Thibaut Cuvelier, November 9, 2022

In scientific fields, writing to present one's work seems easy once the work is done. However, writing is not the goal in itself: to be as effective as possible, the author must think of the reader and present ideas so that the reader understands them easily. That is the point of view the authors of this book have taken. It is clearly aimed at a scientific audience, whatever the field, targeting publication in academic journals or conferences, but most of its content applies very well to other forms of written technical communication, for example after analyzing a dataset. Although the book and its examples are in English, it applies without difficulty to other languages (leaving aside some of the more grammatical sections).

The book is split into two parts: the first focuses on writing in general, the second on the structure of an academic paper. Each chapter deals with a single topic, with many examples (good and not so good), and ends with checklists of the key points. The character of Vladimir reappears regularly to illustrate certain problems in a lighter way.

With this book, the authors fill a glaring gap in conventional scientific education: few people leave higher education with solid writing skills, even though this skill is essential for most knowledge-work jobs.

Reinforcement Learning and Stochastic Optimization

A Unified Framework for Sequential Decisions

by Warren B. Powell

Target audience: Beginner

Publisher's summary

Sequential decision problems, which consist of “decision, information, decision, information,” are ubiquitous, spanning virtually every human activity ranging from business applications, health (personal and public health, and medical decision making), energy, the sciences, all fields of engineering, finance, and e-commerce. The diversity of applications attracted the attention of at least 15 distinct fields of research, using eight distinct notational systems which produced a vast array of analytical tools. A byproduct is that powerful tools developed in one community may be unknown to other communities.

Reinforcement Learning and Stochastic Optimization offers a single canonical framework that can model any sequential decision problem using five core components: state variables, decision variables, exogenous information variables, transition function, and objective function. This book highlights twelve types of uncertainty that might enter any model and pulls together the diverse set of methods for making decisions, known as policies, into four fundamental classes that span every method suggested in the academic literature or used in practice.

Reinforcement Learning and Stochastic Optimization is the first book to provide a balanced treatment of the different methods for modeling and solving sequential decision problems, following the style used by most books on machine learning, optimization, and simulation. The presentation is designed for readers with a course in probability and statistics, and an interest in modeling and applications. Linear programming is occasionally used for specific problem classes. The book is designed for readers who are new to the field, as well as those with some background in optimization under uncertainty.

Throughout this book, readers will find references to over 100 different applications, spanning pure learning problems, dynamic resource allocation problems, general state-dependent problems, and hybrid learning/resource allocation problems such as those that arose in the COVID pandemic. There are 370 exercises, organized into seven groups, ranging from review questions, modeling, computation, problem solving, theory, programming exercises and a “diary problem” that a reader chooses at the beginning of the book, and which is used as a basis for questions throughout the rest of the book.
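
As an illustration of the five core components named in the summary above, here is a minimal sketch of a toy inventory problem laid out along those lines. Everything in it is an assumption made for illustration: the inventory setting, the order-up-to policy, the demand range, the prices, and all function names are hypothetical and are not taken from the book, whose notation may differ. The objective is only evaluated along one simulated sample path rather than as an expectation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """State variable: units on hand at the start of the period (toy example)."""
    inventory: int


def policy(state, order_up_to=10):
    """Decision variable: order enough to bring inventory back to a target level.

    The order-up-to rule and its target are illustrative assumptions.
    """
    return max(0, order_up_to - state.inventory)


def sample_demand(rng):
    """Exogenous information: random demand revealed after the decision is made."""
    return rng.randint(0, 8)


def transition(state, order, demand):
    """Transition function: next state from current state, decision, and new information."""
    return State(inventory=max(0, state.inventory + order - demand))


def contribution(state, order, demand, price=5.0, cost=3.0):
    """One-period contribution: revenue from units sold minus the ordering cost."""
    sold = min(state.inventory + order, demand)
    return price * sold - cost * order


def simulate(horizon=20, seed=0):
    """Objective function, estimated on one sample path: sum of contributions."""
    rng = random.Random(seed)
    state, total = State(inventory=0), 0.0
    for _ in range(horizon):
        order = policy(state)
        demand = sample_demand(rng)
        total += contribution(state, order, demand)
        state = transition(state, order, demand)
    return total, state
```

Calling `simulate(horizon=20, seed=0)` returns the accumulated contribution and the final state for one random demand sequence; averaging over many seeds would give a rough estimate of how the policy performs in expectation.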

Publisher: Wiley - 1136 pages, 1st edition, March 25, 2022

ISBN10: 1119815037 - ISBN13: 9781119815037

Order on www.amazon.fr:

149.43 € incl. tax (publisher's price: 149.43 € incl. tax)

Introduction

Sequential Decision Problems
Canonical Problems and Applications
Online Learning
Introduction to Stochastic Search

Stochastic Search

Derivative-Based Stochastic Search
Stepsize Policies
Derivative-Free Stochastic Search

State-dependent Problems

State-dependent Problems
Modeling Sequential Decision Problems
Uncertainty Modeling
Designing Policies

Policy Search

Policy Function Approximations and Policy Search
Cost Function Approximations

Lookahead Policies

Exact Dynamic Programming
Backward Approximate Dynamic Programming
Forward ADP I: The Value of a Policy
Forward ADP II: Policy Optimization
Forward ADP III: Convex Resource Allocation Problems
Direct Lookahead Policies

Multiagent Systems

Multiagent Modeling and Learning

Editorial review by Thibaut Cuvelier, October 6, 2022

The successes of reinforcement learning regularly make headlines in the computing press, sometimes even the mainstream press, yet the field still consists of a wide variety of approaches whose connections to one another are far from obvious. This book proposes a theory that unifies all approaches to reinforcement learning, a field the author prefers to call sequential decision making for the sake of accuracy. It covers techniques from stochastic mathematical optimization, simulation optimization, and optimal control, among others.

The book revolves mainly around a typology that the author has developed throughout his career to characterize the ways of approaching a sequential decision problem. He claims it is exhaustive, in the sense that any sequential decision technique must fit into it. As evidence, he explains where the approaches of fifteen different fields sit on this map of the domain, moving from the relatively simple problems of stochastic estimation to the most advanced approaches.

The book's other salient point is undoubtedly the importance of modeling relative to algorithms. Modeling serves to state the problem to be solved and its characteristics, which guides the choice of an algorithm likely to be effective without being locked into the straitjacket of one field's classic methods. Modeling is treated theoretically but above all practically, with a large number of examples (explored further in the companion book available online). Overall, the book is aimed mainly at practitioners, even though its links to academic research remain very strong. The author passes on his experience, notably of aspects that theoreticians ignore and that only fully reveal themselves in practice, such as the choice of hyperparameters.

Most chapters contain a "Why does it work?" section that details, from a mathematical point of view, the proofs omitted from the main text. The book positions itself at the intersection of tutorial and reference work, which does not help its overall clarity.

Interpretable Machine Learning with Python

Learn to build interpretable high-performance models with hands-on real-world examples

by Serg Masís

Target audience: Intermediate

Publisher's summary

Do you want to understand your models and mitigate risks associated with poor predictions using machine learning (ML) interpretation? Interpretable Machine Learning with Python can help you work effectively with ML models.

The first section of the book is a beginner's guide to interpretability, covering its relevance in business and exploring its key aspects and challenges. You'll focus on how white-box models work, compare them to black-box and glass-box models, and examine their trade-off. The second section will get you up to speed with a vast array of interpretation methods, also known as Explainable AI (XAI) methods, and how to apply them to different use cases, be it for classification or regression, for tabular, time-series, image or text. In addition to the step-by-step code, the book also helps the reader to interpret model outcomes using examples. In the third section, you'll get hands-on with tuning models and training data for interpretability by reducing complexity, mitigating bias, placing guardrails, and enhancing reliability. The methods you'll explore here range from state-of-the-art feature selection and dataset debiasing methods to monotonic constraints and adversarial retraining.

By the end of this book, you'll be able to understand ML models better and enhance them through interpretability tuning.

What you will learn

Recognize the importance of interpretability in business
Study models that are intrinsically interpretable such as linear models, decision trees, and Naive Bayes
Become well-versed in interpreting models with model-agnostic methods
Visualize how an image classifier works and what it learns
Understand how to mitigate the influence of bias in datasets
Discover how to make models more reliable with adversarial robustness
Use monotonic constraints to make fairer and safer models

Publisher: Packt - 736 pages, 1st edition, March 26, 2021

ISBN10: 180020390X - ISBN13: 9781800203907

Order on www.amazon.fr:

46.47 € incl. tax (publisher's price 46.47 € incl. tax)

Introduction to Machine Learning Interpretation

Interpretation, Interpretability, and Explainability; and Why Does It All Matter?
Key Concepts of Interpretability
Interpretation Challenges

Mastering Interpretation Methods

Fundamentals of Feature Importance and Impact
Global Model-Agnostic Interpretation Methods
Local Model-Agnostic Interpretation Methods
Anchor and Counterfactual Explanations
Visualizing Convolutional Neural Networks
Interpretation Methods for Multivariate Forecasting and Sensitivity Analysis

Tuning for Interpretability

Feature Selection and Engineering for Interpretability
Bias Mitigation and Causal Inference Methods
Monotonic Constraints and Model Tuning for Interpretability
Adversarial Robustness
What's Next for Machine Learning Interpretability?

Editorial review by Thibaut Cuvelier, March 7, 2022

In the vast field of machine learning, it is very common to settle for a model that seems to perform well according to the tests carried out. That is not always enough, however, and this is where model interpretability becomes important: why does the model predict this or that class? Does it perpetuate stereotypes? Interpretation can also explain why a model does not work well in practice, since it lets you examine the reasons behind its predictions.

The field is, for now, at the cutting edge of scientific research, and the author regularly refers to the literature. He does not neglect the applied side, though: the chapters are built around concrete, varied, and realistic examples (based on freely available datasets) and present the code used to interpret aspects of a machine learning model. Nevertheless, the author always treats the concepts as more important than the code and explains them intuitively (without going into a multitude of details).

The book presents a good number of model interpretation methods and does not focus on deep neural networks: on the contrary, most of the methods do not really depend on the learning algorithm.

As for presentation, however, it is regrettable that the images are in black and white (even though the text refers directly to color) and that the quality of the images and tables compromises their legibility.

Paradoxalix - Club member

April 27, 2022 at 20:05

Interpretable Machine Learning with Python

This book is probably the most important one to understand in order to make progress, because it examines the weak link in the AI chain: the human element.

Indeed, the models have probably been statistically sound for ages, but are we sufficiently trained to use them?

Give a child a hammer... and he will hit his own fingers!
Give figures to a minister of the economy... and he will unconsciously interpret them the way he wishes.

The scientific method demands absolute rigor: it accepts no truth without proof, and it has a voracious appetite for proof. I would add that numbers in themselves hold an undeniable power of fascination; seductive, they ask only to mislead us about the meaning of our deductions.

If we do not wish to end up in the ridiculous position of the vast majority of political or medical forecasters, we probably have the greatest need to read this book at least seven times before announcing a prediction.

The remedy is painful for our ego but beneficial without the slightest doubt... in a certain number of cases!!!

In his day, Richelieu said: "Give me ten lines in someone's hand and I will see to it that he is hanged."
Today, a single figure would have been enough...

Regards, Paradoxalix

PS: I am reading it in small spoonfuls; we are small things indeed, and it is hard to take.

Graph Machine Learning

Take graph data to the next level by applying machine learning techniques and algorithms

by Claudio Stamile, Aldo Marzullo, Enrico Deusebio

Target audience: Intermediate

Publisher's summary

Graph Machine Learning provides a new set of tools for processing network data and leveraging the power of the relation between entities that can be used for predictive, modeling, and analytics tasks.

You will start with a brief introduction to graph theory and graph machine learning, understanding their potential. As you proceed, you will become well versed with the main machine learning models for graph representation learning: their purpose, how they work, and how they can be implemented in a wide range of supervised and unsupervised learning applications. You'll then build a complete machine learning pipeline, including data processing, model training, and prediction in order to exploit the full potential of graph data. Moving ahead, you will cover real-world scenarios such as extracting data from social networks, text analytics, and natural language processing (NLP) using graphs and financial transaction systems on graphs. Finally, you will learn how to build and scale out data-driven applications for graph analytics to store, query, and process network information, before progressing to explore the latest trends on graphs.

By the end of this machine learning book, you will have learned essential concepts of graph theory and all the algorithms and techniques used to build successful machine learning applications.

Publisher: Packt - 338 pages, 1st edition, June 25, 2021

ISBN10: 1800204493 - ISBN13: 9781800204492

Order on www.amazon.fr:

41.95 € incl. tax (publisher's price 41.95 € incl. tax)

Introduction to Graph Machine Learning

Getting Started with Graphs
Graph Machine Learning

Machine Learning on Graphs

Unsupervised Graph Learning
Supervised Graph Learning
Problems with Machine Learning on Graphs

Advanced Applications of Graph Machine Learning

Social Network Graphs
Text Analytics and Natural Language Processing Using Graphs
Graph Analysis for Credit Card Transactions
Building a Data-Driven Graph-Powered Application
Novel Trends on Graphs

Editorial review by Thibaut Cuvelier, December 10, 2021

Within artificial intelligence, learning from graphs is a rapidly expanding area. This type of data is quite particular and requires suitable techniques, which this book covers in two parts: one more theoretical, on existing algorithms; the other more practical, with particular emphasis on modeling problems as graphs.

The first half of the book, which presents supervised and unsupervised learning algorithms on graphs, is the more disappointing of the two. It starts from the basics, assuming the reader knows nothing about graphs, then gradually introduces the more advanced notions required for certain analyses. The explanations of the algorithms are very intuitive at first, then become increasingly vague as the subject gets more complicated, favoring examples of using the algorithms with certain Python libraries over details of how they work.

By contrast, the applied part is far more interesting. The authors present modeling approaches that are not obvious, in domains such as natural language processing or fraudulent transaction detection. Few details are left to the imagination, since all the code needed for the analyses and for obtaining predictions is available in the book.

As for layout, the typesetting of the mathematics is the main regret: it is objectively very poor and limits the readability of the formulas. The Python code is also regularly misindented (off by one or two spaces), which is likely to cause problems for those who copy the code exactly as it appears in the book.

Reinforcement Learning

Industrial Applications of Intelligent Agents

by Phil Winder

Target audience: Beginner

Publisher's summary

Reinforcement learning (RL) will deliver one of the biggest breakthroughs in AI over the next decade, enabling algorithms to learn from their environment to achieve arbitrary goals. This exciting development avoids constraints found in traditional machine learning (ML) algorithms. This practical book shows data science and AI professionals how to learn by reinforcement and enable a machine to learn by itself.

Learn what RL is and how the algorithms help solve problems
Become grounded in RL fundamentals including Markov decision processes, dynamic programming, and temporal difference learning
Dive deep into a range of value and policy gradient methods
Apply advanced RL solutions such as meta learning, hierarchical learning, multi-agent, and imitation learning
Understand cutting-edge deep RL algorithms including Rainbow, PPO, TD3, SAC, and more
Get practical examples through the accompanying website

Author Phil Winder of Winder Research covers everything from basic building blocks to state-of-the-art practices. You'll explore the current state of RL, focus on industrial applications, learn numerous algorithms, and benefit from dedicated chapters on deploying RL solutions to production. This is no cookbook; it doesn't shy away from math and expects familiarity with ML.

Publisher: O'Reilly - 381 pages, 1st edition, November 20, 2020

ISBN10: 1098114833 - ISBN13: 9781098114831

Order on www.amazon.fr:

40.66 € incl. tax (publisher's price 40.66 € incl. tax)

Why Reinforcement Learning?
Markov Decision Processes, Dynamic Programming, and Monte Carlo Methods
Temporal-Difference Learning, Q-Learning, and n-Step Algorithms
Deep Q-Networks
Policy Gradient Methods
Beyond Policy Gradients
Learning All Possible Policies with Entropy Methods
Improving How an Agent Learns
Practical Reinforcement Learning
Operational Reinforcement Learning

Editorial review by Thibaut Cuvelier, October 18, 2021

Artificial intelligence has brought many recent scientific advances, but it can remain mysterious to the uninitiated. This book assumes no particular prerequisites in the field and provides very accessible explanations of reinforcement learning, one of the current pillars of recent developments in artificial intelligence. The author does more than share his knowledge: a good part of the book focuses on real applications of reinforcement learning. This is in fact one of its strengths, as the final chapters deal with the industrial deployment of reinforcement learning solutions.

Although reinforcement learning is a highly mathematical field, the author limits exposure to equations as much as possible and keeps only the essentials of the derivations, while accompanying the most complicated formulas with intuitions that help make sense of them.

Although applied, this book presents almost no code, only principles: all the code is available online, because it is more pleasant to read on a screen than on paper, in particular to benefit from the latest updates. The practical applications are also always informed by the latest research advances in the field, some of the algorithms presented having been developed only in the last few years. Even so, the author does not give in to media hype: the topics covered stay down to earth, without extrapolating about technological possibilities, but with ethical considerations.

The Art of Feature Engineering

Essentials for Machine Learning

by Pablo Duboue

Target audience: Intermediate

Publisher's summary

'Pablo Duboue is a true grandmaster of the art and science of feature engineering. His foundational contributions to the creation of IBM Watson were a critical component of its success. Now readers can benefit from his expertise. His book provides deep insights into how to develop, assess, combine, and enhance machine learning features. Of particular interest to advanced practitioners is his discussion of feature engineering and deep learning; there is a pervasive myth in the industry that deep learning and big data have made feature engineering obsolete, but the book explains why that is often incorrect for real-world computing applications and explains the relationship between building effective features and deep neural network architectures. The book engages with countless other basic and advanced topics in the area of machine learning and feature engineering, making it a valuable resource for machine learning practitioners of all levels of experience.' J. William Murdock, IBM

When working with a data set, machine learning engineers might train a model but find that the results are not as good as they need. To get better results, they can try to improve the model or collect more data, but there is another avenue: feature engineering. The feature engineering process can help improve results by modifying the data’s features to better capture the nature of the problem. This process is partly an art and partly a palette of tricks and recipes. This practical guide to feature engineering is an essential addition to any data scientist’s or machine learning engineer’s toolbox, providing new ideas on how to improve the performance of a machine learning solution.

Beginning with the basic concepts and techniques of feature engineering, the text builds up to a unique cross-domain approach that spans data on graphs, texts, time series and images, with fully worked-out case studies. Key topics include binning, out-of-fold estimation, feature selection, dimensionality reduction and encoding variable-length data. The full source code for the case studies is available on a companion website as Python Jupyter notebooks.

Publisher: Cambridge University Press - 284 pages, 1st edition, June 25, 2020

ISBN10: 1108709389 - ISBN13: 9781108709385

Order on www.amazon.fr:

44.77 € incl. tax (publisher's price 44.77 € incl. tax)

Fundamentals

Introduction
Features, Combined: Normalization, Discretization and Outliers
Features, Expanded: Computable Features, Imputation and Kernels
Features, Reduced: Feature Selection, Dimensionality Reduction and Embeddings
Advanced Topics: Variable-Length Data and Automated Feature Engineering

Case Studies

Graph Data
Time stamped Data
Textual Data
Image Data
Other Domains: Video, GIS and Preferences

Editorial review by Thibaut Cuvelier, February 12, 2021

Some Kaggle competitors say that the big difference between winners and everyone else lies not in skill at tuning learning algorithms but in creating new variables, that is, in the art of feature engineering. That is precisely the subject of this book.

The book is largely built on case studies: they make up the second part, where the various techniques presented in the first are put into practice and compared. The author does not only try to show what works well, since most attempts in industrial practice do not deliver the expected results. All of these case studies start from the same use case, estimating the population of cities, from different kinds of data (tabular, textual, graph-based, etc.).

The first part focuses on generic ways of processing a dataset. It is oriented toward the methodology behind computing new features rather than toward the code that performs the mathematical operations. The second half of the book also presents less generic methods, designed instead from detailed knowledge of the application domain.

A good number of the techniques presented are truly at the cutting edge of scientific research in the field. The author has included hundreds of references to the scientific literature (whose methods are not always explained in detail in the book). It is regrettable, however, that the proposed techniques are compared only numerically rather than mathematically, in a way that is therefore not necessarily scientific (but that is not the book's aim).

Bandit Algorithms

by Tor Lattimore and Csaba Szepesvári

Target audience: Intermediate

Publisher's summary

Decision-making in the face of uncertainty is a significant challenge in machine learning, and the multi-armed bandit model is a commonly used framework to address it. This comprehensive and rigorous introduction to the multi-armed bandit problem examines all the major settings, including stochastic, adversarial, and Bayesian frameworks. A focus on both mathematical intuition and carefully worked proofs makes this an excellent reference for established researchers and a helpful resource for graduate students in computer science, engineering, statistics, applied mathematics and economics. Linear bandits receive special attention as one of the most useful models in applications, while other chapters are dedicated to combinatorial bandits, ranking, non-stationary problems, Thompson sampling and pure exploration. The book ends with a peek into the world beyond bandits with an introduction to partial monitoring and learning in Markov decision processes.

Publisher: Cambridge University Press - 536 pages, 1st edition, July 1, 2020

ISBN10: 1108486827 - ISBN13: 9781108486828

Order on www.amazon.fr:

44.07 € incl. tax (publisher's price 44.07 € incl. tax)

Bandits, Probability and Concentration

Introduction
Foundations of Probability
Stochastic Processes and Markov Chains
Stochastic Bandits
Concentration of Measure

Stochastic Bandits with Finitely Many Arms

The Explore-Then-Commit Algorithm
The Upper Confidence Bound Algorithm
The Upper Confidence Bound Algorithm: Asymptotic Optimality
The Upper Confidence Bound Algorithm: Minimax Optimality
The Upper Confidence Bound Algorithm: Bernoulli Noise

Adversarial Bandits with Finitely Many Arms

The Exp3 Algorithm
The Exp3-IX Algorithm

Lower Bounds for Bandits with Finitely Many Arms

Lower Bounds: Basic Ideas
Foundations of Information Theory
Minimax Lower Bounds
Instance-Dependent Lower Bounds
High-Probability Lower Bounds

Contextual and Linear Bandits

Contextual Bandits
Stochastic Linear Bandits
Confidence Bounds for Least Squares Estimators
Optimal Design for Least Squares Estimators
Stochastic Linear Bandits with Finitely Many Arms
Stochastic Linear Bandits with Sparsity
Minimax Lower Bounds for Stochastic Linear Bandits
Asymptotic Lower Bounds for Stochastic Linear Bandits

Adversarial Linear Bandits

Foundations of Convex Analysis
Exp3 for Adversarial Linear Bandits
Follow-the-regularised-Leader and Mirror Descent
The Relation between Adversarial and Stochastic Linear Bandits

Other Topics

Combinatorial Bandits
Non-stationary Bandits
Ranking
Pure Exploration
Foundations of Bayesian Learning
Bayesian Bandits
Thompson Sampling

Beyond Bandits

Partial Monitoring
Markov Decision Processes

Editorial review by Thibaut Cuvelier, January 25, 2021

In the vast field of artificial intelligence, reinforcement learning is discussed more and more for situations where a system learns to interact with its environment (the typical example being the self-driving car). Bandit problems are a particular class of reinforcement learning, with a major simplification (there is no longer any state) that allows extremely thorough theoretical study.

This book aims to be a bible of bandit algorithms, written above all for discovering the field: as far as possible, the algorithms and theorems are explained intuitively. The authors offer a large family of algorithms for widely varied situations, without forgetting the practical side, with realistic examples of applications. However, the target audience clearly has a solid grounding in mathematics, more so than in computer science.

Each chapter ends with a well-crafted series of exercises, although it is regrettable that no solutions are provided. Proofs are numerous but always detailed, which makes it easier to understand the trade-offs made for each algorithm. The authors make a point of stating the assumptions explicitly, chapter by chapter.

The first chapters form a summary of probability from the point of view of measure theory: the exposition is not necessarily crystal clear (because it is succinct), but it makes it possible to generalize probability to the cases needed for the study of bandits. The chapters are all very short, with a division that sometimes seems arbitrary.

Machine Learning Under a Modern Optimization Lens

by Dimitris Bertsimas and Jack Dunn

Target audience: Intermediate

Publisher's summary

The book provides an original treatment of machine learning (ML) using convex, robust and mixed integer optimization that leads to solutions to central ML problems at large scale that can be found in seconds/minutes, can be certified to be optimal in minutes/hours, and outperform classical heuristic approaches in out-of-sample experiments.

Structure of the book:

Part I covers robust, sparse, nonlinear, holistic regression and extensions.
Part II contains optimal classification and regression trees.
Part III outlines prescriptive ML methods.
Part IV shows the power of optimization over randomization in design of experiments, exceptional responders, stable regression and the bootstrap.
Part V describes unsupervised methods in ML: optimal missing data imputation and interpretable clustering.
Part VI develops matrix ML methods: sparse PCA, sparse inverse covariance estimation, factor analysis, matrix and tensor completion.
Part VII demonstrates how ML leads to interpretable optimization.

Philosophical principles of the book:

Interpretability is materially important in the real world.
Practical tractability, not polynomial solvability, leads to real-world impact.
NP-hardness is an opportunity, not an obstacle.
ML is inherently linked to optimization, not probability theory.
Data represent an objective reality; models only exist in our imagination.
Optimization has a significant edge over randomization.
The ultimate objective in the real world is prescription, not prediction.

Publisher: Dynamic Ideas - 589 pages, 1st edition, January 1, 2019

ISBN10 : 1733788506 - ISBN13 : 9781733788502

Order on www.amazon.fr:

274.99 $ incl. tax (publisher's price 94.99 $ incl. tax)

The Optimization Lenses
Robust Regression
Sparse Regression
Nonlinear Regression
Holistic Regression
Sparse and Robust Classification
Classification and Regression Trees
Optimal Classification Trees with Parallel Splits
Optimal Classification Trees with Hyperplane Splits
Optimal Regression Trees with Constant Predictions
Optimal Regression Trees with Linear Predictions
Optimal Trees and Neural Networks
From Predictive to Prescriptive Analytics
Optimal Prescriptive Trees
Optimal Design of Experiments
Identifying Exceptional Responders
Stable Regression
The Bootstrap
Optimal Missing Data Imputation
Interpretable Clustering
Sparse Principal Component Analysis
Factor Analysis
Sparse Inverse Covariance Estimation
Interpretable Matrix Completion
Tensor Learning
Interpretable Optimization

Editorial review by Thibaut Cuvelier, December 6, 2020

Today's machine learning algorithms were developed several decades ago and have seen only fairly minor modifications since. As a result, they rely on the optimization capabilities available at that time. The problem is that optimization tools (especially integer and semidefinite programming) have made enormous progress since then: instead of relying only on heuristics (which quickly deliver models of highly variable quality), this book proposes methods that take advantage of the latest advances in the field (methods that yield optimal solutions, which cannot be improved upon). The book covers the current areas of artificial intelligence, from learning techniques (regression, trees, neural networks) to clustering and adversarial use cases.

This approach has not yet been adopted by the artificial intelligence community at large; much of the content corresponds to the research work of the two authors and their collaborators (even though they do not monopolize the long and very diverse list of references). In particular, this means that no widespread implementation exists yet for most of the algorithms presented.

Each technique is first presented from first principles, with a brief explanation of the current (heuristic) techniques and of their drawbacks (often, the interpretability of the resulting models). The authors emphasize principles rather than algorithmic implementation details (although the main ones are explained). Each technique is studied in depth, with case studies showing its contribution relative to current heuristics, as well as numerical studies on synthetic data comparing the performance of the optimal algorithms with that of common heuristics. In some cases, polynomial-time algorithms are proposed, but the goal is to present algorithms that are useful in practice, not only in theory.

This is not at all a book on probability or statistics, unlike most of the machine learning literature: the authors mainly adopt the optimization point of view to solve learning problems, which sheds a completely different light on them. Their style is very rigorous, yet the text remains readable (it is not cluttered with minor theorems and results). However, to get the most out of the content, a good grounding in machine learning is required; above all, genuine skills in optimization (mainly convex and integer) will make it easier to follow.

The book also lends itself to an alternative use: as an advanced book on optimization. It can thus be read as a series of tutorials on more or less advanced methods for solving complex optimization problems.

Ensemble Learning

Pattern Classification Using Ensemble Methods

by Lior Rokach

Target audience: Intermediate

Publisher's summary

This updated compendium provides a methodical introduction with a coherent and unified repository of ensemble methods, theories, trends, challenges, and applications. More than a third of this edition comprises new material, highlighting descriptions of the classic methods as well as extensions and novel approaches that have recently been introduced.

Along with algorithmic descriptions of each method, the settings in which each method is applicable and the consequences and tradeoffs incurred by using the method are succinctly featured. R code for implementation of the algorithms is also emphasized.

The unique volume provides researchers, students and practitioners in industry with a comprehensive, concise and convenient resource on ensemble learning methods.

Publisher: WorldScientific - 300 pages, 2nd edition, February 19, 2019

ISBN10 : 9811201951 - ISBN13 : 9789811201950

Order on www.amazon.fr:

122.31 € incl. tax (publisher's price 116.50 € incl. tax)

Introduction to Machine Learning
Classification and Regression Trees
Introduction to Ensemble Learning
Ensemble Classification
Gradient Boosting Machines
Ensemble Diversity
Ensemble Selection
Error Correcting Output Codes
Evaluating Ensembles of Classifiers

Editorial review by Thibaut Cuvelier, August 22, 2020

Among learning strategies in artificial intelligence, ensemble methods are often acclaimed. Indeed, they have very strong predictive power without, for all that, performing very poorly at training time. This book presents the current state of knowledge in this area without requiring heavy prerequisites: a few notions of machine learning, in particular decision trees, and a little probability are all that is needed to read it (although a more advanced level will allow the reader to get more out of it).

The author uses a fairly high level of formalization, which may put off readers averse to mathematics. All notation is well explained, and the pseudocode is clear. Throughout the book, the material moves from very practical elements (implementing a given algorithm in R) to more theoretical ones (such as the reasons why ensemble methods should work well, notably through links with learning theory). The book, which is fairly complete, also contains an extensive scientific bibliography that ranges from essential pioneering articles to numerical studies.

The book is firmly rooted in the present: it covers at some length the methods most widely used today (such as GBM and its histogram-based implementation). On occasion, the author also proposes one or two methodologies not yet present in the literature. Besides trees (used in almost all ensemble methods), the book also presents techniques based on neural networks. The chapter on evaluation is very generic and contains few elements specific to ensemble methods. Its section on interpretation is disappointing: it does not even introduce the notion of importance or of local models (such as LIME).

Handbook of Machine Learning

Volume 2: Optimization and Decision Making

by Tshilidzi Marwala, Collins Achepsah Leke

Target audience: Intermediate

Publisher's summary

Building on Handbook of Machine Learning - Volume 1: Foundation of Artificial Intelligence, this volume on Optimization and Decision Making covers a range of algorithms and their applications. Like the first volume, it provides a starting point for machine learning enthusiasts as a comprehensive guide on classical optimization methods. It also provides an in-depth overview on how artificial intelligence can be used to define, disprove or validate economic modeling and decision making concepts.

Publisher: WorldScientific - 320 pages, 1st edition, December 21, 2019

ISBN10 : 9811205663 - ISBN13 : 9789811205668

Order on www.amazon.fr:

113.70 € incl. tax (publisher's price 113.70 € incl. tax)

Introduction
Classical Optimization
Genetic Algorithm
Particle Swarm Optimization
Simulated Annealing
Response Surface Method
Ant Colony Optimization
Bat and Firefly Algorithms
Artificial Immune System
Invasive Weed Optimization and Cuckoo Search Algorithms
Decision Trees and Random Forests
Hybrid Methods
Economic Modeling
Condition Monitoring
Rational Decision-Making
Conclusion Remarks

Editorial review by Thibaut Cuvelier, August 9, 2020

Data science often focuses on understanding data but too often ignores decision making: how can the results of a learning model be put into practice? As in the first volume, the authors offer a series of case studies grouped by technique, building on the techniques of the first volume. The reader thus sees a series of examples in which these techniques are applied, along with the results that can be achieved.

This book covers a large number of optimization algorithms: the classical algorithms of continuous optimization, but also the whole current range of metaheuristics. Notably, no exact method is presented for nonconvex problems (even though such methods have existed for a long time and are effective in a great many practical cases). These techniques are explained very concisely, but clearly.

As with the first volume, one of the main contributions of this book is, without question, the number of references included, both for the algorithms and for their applications (which account for most of the references). However, the authors are so convinced by metaheuristics that they state falsehoods about them: none of these algorithms can find a globally optimal solution with certainty.

Handbook of Machine Learning

Volume 1: Foundation of Artificial Intelligence

by Tshilidzi Marwala

Target audience: Intermediate

Publisher's summary

This is a comprehensive book on the theories of artificial intelligence with an emphasis on their applications. It combines fuzzy logic and neural networks, as well as hidden Markov models and genetic algorithms, describes advancements and applications of these machine learning techniques, and describes the problem of causality. This book should serve as a useful reference for practitioners in artificial intelligence.

Publisher: WorldScientific - 328 pages, 1st edition, October 22, 2018

ISBN10 : 9813271221 - ISBN13 : 9789813271227

Order on www.amazon.fr:

128.01 € incl. tax (publisher's price 128.01 € incl. tax)

Introduction
Multi-layer Perceptron
Radial Basis Function
Automatic Relevance Determination
Bayesian Networks
Support Vector Machines
Fuzzy Logic
Rough Sets
Hybrid Machines
Auto-associative Networks
Evolving Networks
Causality
Gaussian Mixture Models
Hidden Markov Models
Reinforcement Learning
Conclusion Remarks

Editorial review by Thibaut Cuvelier, August 7, 2020

Artificial intelligence is the fashionable topic of the day, with a great many books (among other media) covering the subject more or less appropriately. This book takes a rather different point of view from most: it focuses neither on the possible algorithms nor on the libraries that implement them, but on the applications of these techniques (academic ones, for the overwhelming majority). Where many resources on the subject focus on how to reach a given goal, the author presents the results that can be achieved in practice.

The variety of areas covered is impressive; the reader really gets an overview of a whole range of artificial intelligence techniques. This panorama is not complete, of course (there is almost no use of decision trees, for example), but it includes little-known tools, such as ARD for the interpretability of neural networks; Bayesian statistics are well represented, including in neural networks. Many senses of the expression "artificial intelligence" are described, including fuzzy logic and evolutionary techniques. However, the chapter on neural networks is somewhat disappointing: it seems to stop before the developments of the past ten years (small networks only). The author still includes RBFs, a technique that has fallen out of favor (even though it remains useful).

One of the main contributions of this book is, without question, the number of references included, both for the algorithms and for their applications (which account for most of the references). Some references are not necessarily well suited to the context (notably operations research articles cited for machine learning); the author of the book is overrepresented in the citations.

The book gets off to a flying start: the reader is clearly expected to already have some experience in the field, or will be lost right away. In particular, some knowledge of statistics, stochastic processes, and data processing is expected.

The explanations of algorithms are always concise; no technique gets more than about ten pages. This level of concision is sometimes problematic for more technical topics, such as causal inference. The applications are also summarized, but less tersely: these parts can provide ideas to feed one's daily data science practice.

Fundamentals of Data Visualization

A Primer on Making Informative and Compelling Figures

by Claus O. Wilke

Target audience: Intermediate

Publisher's summary

Effective visualization is the best way to communicate information from the increasingly large and complex datasets in the natural and social sciences. But with the increasing power of visualization software today, scientists, engineers, and business analysts often have to navigate a bewildering array of visualization choices and options.

This practical book takes you through many commonly encountered visualization problems, and it provides guidelines on how to turn large datasets into clear and compelling figures. What visualization type is best for the story you want to tell? How do you make informative figures that are visually pleasing? Author Claus O. Wilke teaches you the elements most critical to successful data visualization.

Explore the basic concepts of color as a tool to highlight, distinguish, or represent a value
Understand the importance of redundant coding to ensure you provide key information in multiple ways
Use the book’s visualizations directory, a graphical guide to commonly used types of data visualizations
Get extensive examples of good and bad figures
Learn how to use figures in a document or report and how to employ them effectively to tell a compelling story

Publisher: O'Reilly - 300 pages, 1st edition, April 23, 2019

ISBN10 : 1492031089 - ISBN13 : 9781492031079

Order on www.amazon.fr:

43.19 € incl. tax (publisher's price 43.19 € incl. tax)

Introduction
From Data to Visualization
Visualizing Data: Mapping Data onto Aesthetics
Coordinate Systems and Axes
Color Scales
Directory of Visualizations
Visualizing Amounts
Visualizing Distributions: Histograms and Density Plots
Visualizing Distributions: Empirical Cumulative Distribution Functions and Q-Q Plots
Visualizing Many Distributions at Once
Visualizing Proportions
Visualizing Nested Proportions
Visualizing Associations Among Two or More Quantitative Variables
Visualizing Time Series and Other Functions of an Independent Variable
Visualizing Trends
Visualizing Geospatial Data
Visualizing Uncertainty
Principles of Figure Design
The Principle of Proportional Ink
Handling Overlapping Points
Common Pitfalls of Color Use
Redundant Coding
Multipanel Figures
Titles, Captions, and Tables
Balance the Data and the Context
Use Larger Axis Labels
Avoid Line Drawings
Don’t Go 3D
Miscellaneous Topics
Understanding the Most Commonly Used Image File Formats
Choosing the Right Visualization Software
Telling a Story and Making a Point

Editorial review by Thibaut Cuvelier, July 18, 2020

Visualization is often the poor relation of scientific education: students are expected to know when a visualization is good, and courses merely show them how to produce one figure or another with a given piece of software. Later, as professionals, these same students will often lack the means to correct course when necessary, unless they invest heavily in it. This book seeks to provide all the elements needed to produce effective visualizations that convey the intended message. In doing so, it shows very little mathematics and no code at all, and it avoids excessive jargon, so as to be accessible to a wide audience.

The book is divided into two main parts. The first focuses on the types of charts that can be created and the situations in which they are useful. The second part is more specific: it deals, for example, with the choice of colors and with preparing a chart suited to color-blind readers.

Each chapter is heavily illustrated with good and not-so-good examples of charts. Above all, in each case the author clearly states what he thinks of them (clearly bad, merely ugly) and gives his reasons (haphazard use of colors and sizes, suitability for the type of data to be represented, etc.); he then gives pointers for improving these charts. These examples are not abstract: they could very well come from everyday practice. On these points, the book stands out from some of its less applied competitors.

The author assumes some familiarity with charting software (if only a spreadsheet such as Excel, even though he discourages its use), because he never discusses any single piece of software in particular. It is up to the reader to transpose the advice into everyday practice. The vocabulary used is often reminiscent of Leland Wilkinson (Grammar of Graphics, whose principles are taken up by libraries such as ggplot or plotnine). The code used to generate all the visualizations is available online (using ggplot2 in R).

In terms of aesthetic quality, the book is printed in color on good-quality paper, which makes it very pleasant to hold.

Practical Tableau

100 Tips, Tutorials, and Strategies from a Tableau Zen Master

by Ryan Sleeper

Target audience: Intermediate

Publisher's summary

Whether you have some experience with Tableau software or are just getting started, this manual goes beyond the basics to help you build compelling, interactive data visualization applications. Author Ryan Sleeper, one of the world’s most qualified Tableau consultants, complements his web posts and instructional videos with this guide to give you a firm understanding of how to use Tableau to find valuable insights in data.

Over five sections, Sleeper—recognized as a Tableau Zen Master, Tableau Public Visualization of the Year author, and Tableau Iron Viz Champion—provides visualization tips, tutorials, and strategies to help you avoid the pitfalls and take your Tableau knowledge to the next level.

Practical Tableau sections include:

Fundamentals: get started with Tableau from the beginning
Chart types: use step-by-step tutorials to build a variety of charts in Tableau
Tips and tricks: learn innovative uses of parameters, color theory, how to make your Tableau workbooks run efficiently, and more
Framework: explore the INSIGHT framework, a proprietary process for building Tableau dashboards
Storytelling: learn tangible tactics for storytelling with data, including specific and actionable tips you can implement immediately

Publisher: O'Reilly - 624 pages, 1st edition, May 8, 2018

ISBN10 : 1491977310 - ISBN13 : 9781491977316

Order on www.amazon.fr:

29.41 € incl. tax (publisher's price 29.41 € incl. tax)

Fundamentals

How to Learn Tableau: My Top Five Tips
Tip #5: Follow the Community
Tip #4: Take a Training Class
Tip #3: Read Up
Tip #2: Practice
Tip #1: Tableau Public
Which Tableau Product Is Best for Me?
Tableau Desktop: Personal
Tableau Desktop: Professional
Tableau Reader
Tableau Public
Tableau Online
Tableau Server
An Introduction to Connecting to Data
Shaping Data for Use with Tableau
Getting a Lay of the Land
Tableau Terminology
View the Underlying Data
View the Number of Records
Dimension Versus Measure
What Is a Measure?
What Is a Dimension?
Discrete Versus Continuous
Five Ways to Make a Bar Chart/An Introduction to Aggregation
Five Ways to Create a Bar Chart in Tableau
An Introduction to Aggregation in Tableau
Line Graphs, Independent Axes, and Date Hierarchies
How to Make a Line Graph in Tableau
Independent Axes in Tableau
Date Hierarchies in Tableau
Marks Cards, Encoding, and Level of Detail
An Explanation of Level of Detail
An Introduction to Encoding
Label and Tooltip Marks Cards
An Introduction to Filters
Dimension Filters in Tableau
Measure Filters in Tableau
More Options with Filters
An Introduction to Calculated Fields
Why Use Calculated Fields?
More on Aggregating Calculated Fields
An Introduction to Table Calculations
An Introduction to Parameters
An Introduction to Sets
How to Create a Set in Tableau
Five Ways to Use Tableau Sets
An Introduction to Level of Detail Expressions
An Introduction to Dashboards and Distribution
An Introduction to Dashboards in Tableau
Distributing Tableau Dashboards

Chart Types

A Spreadsheet Is Not a Data Visualization
How to Make a Highlight Table
How to Make a Heat Map
How to Make a Dual-Axis Combination Chart
How to Make a Scatter Plot
How to Make a Tree Map
How to Make Sparklines
How to Make Small Multiples
How to Make Bullet Graphs
How to Make a Stacked Area Chart
How to Make a Histogram
How to Make a Box-and-Whisker Plot
How to Make a Symbol Map with Mapbox Integration
How to Make a Filled Map
How to Make a Dual-Axis Map
How to Map a Sequential Path
How to Map Anything in Tableau
How to Make Custom Polygon Maps
How to Make a Gantt Chart
How to Make a Waterfall Chart
How to Make Dual-Axis Slope Graphs
How to Make Donut Charts
How to Make Funnel Charts
Introducing Pace Charts in Tableau
How to Make a Pareto Chart
How to Make a Control Chart
How to Make Dynamic Dual-Axis Bump Charts
How to Make Dumbbell Charts
How and Why to Make Customizable Jitter Plots

Tips and Tricks

How to Create Icon-Based Navigation or Filters
How to Make a What-If Analysis Using Parameters
Three Ways to Add Alerts to Your Dashboards
Alert 1: Date Settings
Alert 2: Dynamic Labels
Alert 3: Heat Map Dashboard with Optional Tableau Server Email
How to Add Instructions or Methodology Using Custom Shape Palettes
Ten Tableau Data Visualization Tips I Learned from Google Analytics
Use a Maximum of 12 Dashboard Objects
Improve User Experience by Leveraging Dashboard Actions
Allow End Users to Change the Date Aggregation of Line Graphs
Keep Crosstab Widths to a Maximum of Ten Columns
Use a Vertical Navigation in the Left Column
Choose Five or Fewer Colors for Your Dashboards
Stick Mostly to Lines and Bars
Include Comparisons Such as Year Over Year
Bring Your Data Visualization to Life Using Segmentation
Include Alerts of Exceptional or Poor Performance
Three Alternative Approaches to Pie Charts in Tableau
Tableau Pie Chart Alternative #1: Bar Chart
Tableau Pie Chart Alternative #2: Stacked Bars or Areas
Tableau Pie Chart Alternative #3: My Recommended Approach
How to Create and Compare Segments
Five Design Tips for Enhancing Your Tableau Visualizations
Color
Typography
Layout
Usability
Details
Leveraging Color to Improve Your Data Visualization
The Color Wheel: Where It All Begins
The Psychology of Color
Using Custom Color Palettes in Tableau
Three Creative Ways to Use Dashboard Actions
A Primer on Tableau Dashboard Actions
Tableau Dashboard Action #1: Use Every Sheet as a Filter
Tableau Dashboard Action #2: Embed YouTube Videos in a Dashboard
Tableau Dashboard Action #3: Do a Google Search or Google Image Search from a Dashboard
How to Conditionally Format Individual Rows or Columns
How to Use Legends Per Measure
How to Conditionally Format in Tableau Like Excel
The Solution: A Calculated “Placeholder” Field
Five Tips for Creating Efficient Workbooks
Using Level of Detail Expressions to Create Benchmarks
Designing Device-Specific Dashboards
How to Make a Stoplight 100-Point Index
What Is a Stoplight Index?
Why Do I Have to Use the Fancy Approach You’re About to Share?
How to Set Up a 100-Point Index
Adding Color to a 100-Point Index Table
What If Outperforming the Comparison Is Bad?
The Case for One-Dimensional Unit Charts
How to Highlight a Dimension
Allow Users to Choose Measures and Dimensions
How to Dynamically Format Numbers
How to Change Date Aggregation Using Parameters
How to Equalize Year-Over-Year Dates
How to Filter Out Partial Time Periods
How to Compare Two Date Ranges on One Axis
How to Compare Unequal Date Ranges on One Axis
How to Make a Cluster Analysis
Five Tips for Making Your Tableau Public Viz Go Viral
Tip #1: Create “Remarkable” Content
Tip #2: Balance Data and Design
Tip #3: Leverage Search Engine Optimization (SEO)
Tip #4: Network
Tip #5: Use Reddit
Three Ways to Make Beautiful Bar Charts in Tableau
Approach #1: Use Formatting Available in Tableau
Approach #2: Use Axis Rulers to Add a Baseline
Approach #3: Add Caps to Bars
Three Ways to Make Lovely Line Graphs in Tableau
Approach #1: Use Formatting Available in Tableau
Approach #2: Maximize the Data-Ink Ratio
Approach #3: Leverage the Dual-Axis
Three Ways Psychological Schemas Can Improve Your Data Visualization
Schema #1: Spatial Context
Schema #2: Icons/Shapes/Symbols
Schema #3: Color

Framework

Introducing the INSIGHT Framework for Data Visualization
Identify the Business Question
Name KPIs
Shape the Data
Shaping Data for Use with Tableau
Joining and Aggregating Data
Laying Out Data for Specific Analyses
Shaping Data for the Iron Viz Example
Initial Concept
Gather Feedback
Hone Dashboard
Tell the Story

Storytelling

Introduction to Storytelling
A Data Visualization Competition—That’s Also an Analogy for the Data Visualization Process
Tip #1: Know Your Audience
Tip #2: Smooth the Excel Transition
Tip #3: Leverage Color
Tip #4: Keep It Simple
Tip #5: Use the Golden Ratio
Tip #6: Retell an Old Story
Tip #7: Don’t Neglect the Setup
Tip #8: Don’t Use Pie Charts
Tip #9: Provide Visual Context
Tip #10: Use Callout Numbers
Tip #11: Allow Discovery
Tip #12: Balance Data and Design
Tip #13: Eliminate Chartjunk (But Not Graphics)
Tip #14: Use Freeform Dashboard Design
Tip #15: Tell a Story

Editorial review by Thibaut Cuvelier, 29 March 2020

At a time when data is king, the art of visualization is becoming ever more important, both for understanding that data and for communicating meaningful facts. This book focuses on the second aspect, that is, on creating charts that are effective and pleasing to the eye. The author also gives advice on integrating these charts into a contextualized narrative around the data, notably through his INSIGHT methodology (which is in no way specific to Tableau). The main message is that a chart has to be thought through: you cannot produce good visualizations by charging in headfirst.

The book begins with a general overview of the Tableau tool (already somewhat dated, since the Personal edition no longer exists), of the main chart types available, and of the right way to use them (especially the more exotic ones). The book consists of one hundred very short, independent chapters, each of which fits into a coffee break. It can be read linearly or used as a reference: each chapter is written as a small, precise tutorial focused on one feature of the software. The author regularly shares feedback from his own experience or draws on how others have designed effective visualizations. The aim is to show one good way of using the software, but certainly not the only one. Since many users come from an Excel background, the differences from Tableau are regularly highlighted to ease the learning curve.

However, it is regrettable that the level rises very quickly in the first chapters. It is better to have a little experience with Tableau to get the most out of the book; otherwise the specialized vocabulary (used from the very start) could be off-putting.

The color figures are welcome in a book on visualization, although they are not strictly necessary. There is no shortage of illustrations, even if some are too small to be legible on paper.




Data Science from Scratch

First Principles with Python

by Joel Grus

Target audience: Beginner

Publisher's summary

To really learn data science, you should not only master the tools—data science libraries, frameworks, modules, and toolkits—but also understand the ideas and principles underlying them. Updated for Python 3.6, this second edition of Data Science from Scratch shows you how these tools and algorithms work by implementing them from scratch.

If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Packed with new material on deep learning, statistics, and natural language processing, this updated book shows you how to find the gems in today’s messy glut of data.

Get a crash course in Python
Learn the basics of linear algebra, statistics, and probability—and how and when they’re used in data science
Collect, explore, clean, munge, and manipulate data
Dive into the fundamentals of machine learning
Implement models such as k-nearest neighbors, Naïve Bayes, linear and logistic regression, decision trees, neural networks, and clustering
Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

Publisher: O'Reilly - 406 pages, 2nd edition, 16 May 2019

ISBN10: 1492041130 - ISBN13: 9781492041139

Order on www.amazon.fr:

40.74 € incl. tax (publisher's price 40.74 € incl. tax)

Preface to the Second Edition
Preface to the First Edition
Introduction
A Crash Course in Python
Visualizing Data
Linear Algebra
Statistics
Probability
Hypothesis and Inference
Gradient Descent
Getting Data
Working with Data
Machine Learning
k-Nearest Neighbors
Naive Bayes
Simple Linear Regression
Multiple Regression
Logistic Regression
Decision Trees
Neural Networks
Deep Learning
Clustering
Natural Language Processing
Network Analysis
Recommender Systems
Databases and SQL
MapReduce
Data Ethics
Go Forth and Do Data Science

Editorial review by Thibaut Cuvelier, 4 March 2020

Data science is such a fashionable field that it attracts people from all backgrounds... and books of varying quality that try to meet the demand. Most books either dig deep into the theory and mathematics behind each algorithm (for how to actually use the techniques, the authors refer readers to the documentation, if they bother at all) or paint such a quick picture that readers do not understand what is really going on (while the API of one very specific library is explained at length). This book seeks a third way, between understanding the algorithms and a pragmatic approach: understanding an algorithm comes from implementing it, so as to grasp its ins and outs. This maxim is pushed to its limits, with one of the last chapters briefly presenting the use of SQL databases... along with a very basic implementation of such a system.

To reach as wide an audience as possible, the prerequisites are very limited: programming skills (an entire chapter is devoted to Python, the language used throughout the book) and no strong aversion to mathematics (all the necessary concepts are explained quickly, mainly from an algorithmic point of view). From there, the author presents the full cycle of a data science project, from acquiring and cleaning the data to deploying a learning model, with forays into clustering and graph analysis.

Even though the algorithms are implemented from scratch, the author does not lose sight of practice: almost nobody implements classic algorithms themselves. Readers therefore get pointers to scikit-learn or other Python libraries, as appropriate. The functions developed have an interface fairly close to the ones these libraries offer, but the resemblance is most striking for neural networks (the interface being very close to the Sequential class in Keras). The big difference from common libraries lies in the data structures: to keep the code as explicit and clear as possible, the author uses very specific data structures such as lists of named tuples or dictionaries, whereas in practice matrices are needed for performance reasons.
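
The contrast between explicit records and matrices can be illustrated with a minimal sketch. It is not taken from the book: the `Observation` record, its field names, the sample values, and both helper functions are hypothetical, and a plain nested list stands in for the array type that a numerical library would provide.

```python
from typing import List, NamedTuple


class Observation(NamedTuple):
    """One record with named fields (hypothetical names, for illustration)."""
    height_cm: float
    weight_kg: float


# Explicit style: every value carries a name, so the code documents itself.
observations: List[Observation] = [
    Observation(height_cm=170.0, weight_kg=65.0),
    Observation(height_cm=180.0, weight_kg=80.0),
    Observation(height_cm=160.0, weight_kg=55.0),
]


def mean_height_explicit(rows: List[Observation]) -> float:
    """Average a single field, accessed by name."""
    return sum(row.height_cm for row in rows) / len(rows)


# Matrix style: the same data as rows of bare numbers, one column per feature.
# Column 0 is height and column 1 is weight; the reader has to remember that.
matrix: List[List[float]] = [list(row) for row in observations]


def column_means(rows: List[List[float]]) -> List[float]:
    """Average every column in one pass, the way an array library would."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


explicit_mean = mean_height_explicit(observations)
all_means = column_means(matrix)
```

The named-tuple version is easier to read and to type-check, while the matrix version treats all features uniformly, which is the layout that array libraries can process efficiently.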

While most of the book deals with the algorithmic side of data science, the author emphasizes statistics and, wherever possible, ties the reasoning to more formal approaches: an intuition about the data gained from an algorithm may have no statistical meaning, in which case that intuition is not worth much. Likewise, model interpretability is given prominence.

The code is always clean and makes use of the latest Python features. The writing style is very informal, which suits the target audience perfectly. One regret, however, is the treatment of visualization, a very important topic in data science that is covered rather quickly.

In short: the author delivers on his promise!


Generative Deep Learning

Teaching Machines to Paint, Write, Compose, and Play

by David Foster

Target audience: Intermediate

Publisher's summary

Generative modeling is one of the hottest topics in AI. It's now possible to teach a machine to excel at human endeavors such as painting, writing, and composing music. With this practical book, machine-learning engineers and data scientists will discover how to re-create some of the most impressive examples of generative deep learning models, such as variational autoencoders, generative adversarial networks (GANs), encoder-decoder models, and world models.

Author David Foster demonstrates the inner workings of each technique, starting with the basics of deep learning before advancing to some of the most cutting-edge algorithms in the field. Through tips and tricks, you'll understand how to make your models learn more efficiently and become more creative.

Discover how variational autoencoders can change facial expressions in photos
Build practical GAN examples from scratch, including CycleGAN for style transfer and MuseGAN for music generation
Create recurrent generative models for text generation and learn how to improve the models using attention
Understand how generative models can help agents to accomplish tasks within a reinforcement learning setting
Explore the architecture of the Transformer (BERT, GPT-2) and image generation models such as ProGAN and StyleGAN

Publisher: O'Reilly - 350 pages, 1st edition, 12 July 2019

ISBN10: 1492041947 - ISBN13: 9781492041948

Order on www.amazon.fr:

46.18 € incl. tax (publisher's price 46.18 € incl. tax)

Preface

Introduction to Generative Deep Learning

Generative Modeling
Deep Learning
Variational Autoencoders
Generative Adversarial Networks

Teaching Machines to Paint, Write, Compose, and Play

Paint
Write
Compose
Play
The Future of Generative Modeling
Conclusion

Editorial review by Thibaut Cuvelier, 4 March 2020

Machine learning has many application areas, and one of the most surprising is undoubtedly content generation: images, text, sound, and so on. In recent years there have been more and more stunning demonstrations of transferring a painter's style onto modern photographs, of automated text writing, etc. This book focuses on the principles underlying these techniques and on implementing them with neural networks.

The author divides the chapters fairly clearly into two parts. The first mainly covers theory, with the main neural network architectures applied to content generation problems. It includes a few basic applications, but the most interesting ones have to wait for the second half. There, the basic designs are adapted to reach the current state of the art on content generation tasks. The book does not aim to be exhaustive: the author has made a reasoned selection of varied techniques in order to show as many tools as possible that generalize to other tasks. He also does not hesitate to give references to the scientific literature, which is abundant in this field, for readers who want to go deeper.

No chapter stays abstract, since each technique is implemented in Python. No detail is missing; everything is explicit and explained, and the author pays particular attention to this: the book covers the neural network architectures as well as the algorithms needed to produce the expected result. Generic architectures are presented only once, after which their use is hidden behind the author's own classes. In addition, all the code is available on GitHub to make it easier to reuse.

The applications are varied and are not limited to an artificial form of art. The author also details generating answers to questions and including generative components in reinforcement learning techniques. The principles are similar, but one does not expect to see them used in this context.

The book's color printing is also appreciable, as it allows for syntax highlighting of the code. This small extra is very pleasant. Moreover, the author's style is engaging and not very academic: the book is very easy to read.


Practical Time Series Analysis

Prediction With Statistics and Machine Learning

by Aileen Nielsen

Target audience: Beginner

Publisher's summary

Time series data analysis is increasingly important due to the massive production of such data through the internet of things, the digitalization of healthcare, and the rise of smart cities. As continuous monitoring and data collection become more common, the need for competent time series analysis with both statistical and machine learning techniques will increase.

Covering innovations in time series data analysis and use cases from the real world, this practical guide will help you solve the most common data engineering and analysis challenges in time series, using both traditional statistical and modern machine learning techniques. Author Aileen Nielsen offers an accessible, well-rounded introduction to time series in both R and Python that will have data scientists, software engineers, and researchers up and running quickly.

You'll get the guidance you need to confidently:

Find and wrangle time series data
Undertake exploratory time series data analysis
Store temporal data
Simulate time series data
Generate and select features for a time series
Measure error
Forecast and classify time series with machine or deep learning
Evaluate accuracy and performance

Publisher: O'Reilly - 400 pages, 1st edition, 1 November 2019

ISBN10: 1492041653 - ISBN13: 9781492041658

Order on www.amazon.fr:

44.81 € incl. tax (publisher's price 44.81 € incl. tax)

Preface
Time Series: An Overview and a Quick History
Finding and Wrangling Time Series Data
Exploratory Data Analysis for Time Series
Simulating Time Series Data
Storing Temporal Data
Statistical Models for Time Series
State Space Models for Time Series
Generating and Selecting Features for a Time Series
Machine Learning for Time Series
Deep Learning for Time Series
Measuring Error
Performance Considerations in Fitting and Serving Time Series Models
Healthcare Applications
Financial Applications
Time Series for Government
Time Series Packages
Forecasts About Forecasting

Editorial review by Thibaut Cuvelier, 18 February 2020

Time series arise in more and more situations, whereas not long ago they were studied almost exclusively in a financial context. This book sets out to present the range of possible uses of time series analysis in various fields; however, its author aims for exhaustiveness in the methods for approaching time series rather than in their applications. She delivers a thorough review of time series analysis and forecasting techniques, both the most classical statistical ones (ARIMA and family) and the most recent (from machine learning to deep learning). Indeed, the simplest methods are sometimes the most appropriate.

Whereas most books on the subject require a good knowledge of statistics from the reader, this one does not. To be more widely accessible, the author assumes only a basic grounding in programming and statistics, and little else: all the necessary concepts are briefly re-explained where needed. The approach is genuinely practical. Readers who want to dig into the theory behind certain methods will not find what they are looking for here, as the purely mathematical explanations are kept to a bare minimum; on the other hand, there are explanations of the potential weaknesses of data acquisition mechanisms and of the data cleaning that inevitably follows data retrieval. This is not a cookbook, though: the author always tries to explain the why, much more than the how (which is the preserve of technical documentation). There is, moreover, only rarely any point in coding classic algorithms oneself, a point the author explains clearly.

One may nevertheless wonder why the code is provided alternately in R and in Python: in some situations one of the two languages has a slightly more developed ecosystem in a particular area, but the rest of the time no justification is given. The switches from one language to the other are rarely explained, even though one would sometimes like to know whether it is really worth investing in another language for data analysis. In particular, the author shows a number of visualizations, but only ever in R, never in Python...

The book is full of references, often to sites such as StackOverflow: the explanations there are often not very detailed (unlike in a scientific article), but reasonably easy to understand. One small regret is that the links tend to point to questions rather than to answers. Existing software is likewise well referenced. Compared with other sources, the book contributes lists of software and libraries that are not necessarily among the best known but are, above all, among the most useful in practice.


Hands-On Unsupervised Learning Using Python

How to Build Applied Machine Learning Solutions from Unlabeled Data

by Ankur A. Patel

Target audience: Beginner

Publisher's summary

Many industry experts consider unsupervised learning the next frontier in artificial intelligence, one that may hold the key to general artificial intelligence. Since the majority of the world's data is unlabeled, conventional supervised learning cannot be applied. Unsupervised learning, on the other hand, can be applied to unlabeled datasets to discover meaningful patterns buried deep in the data, patterns that may be near impossible for humans to uncover.

Author Ankur Patel shows you how to apply unsupervised learning using two simple, production-ready Python frameworks: Scikit-learn and TensorFlow using Keras. With code and hands-on examples, data scientists will identify difficult-to-find patterns in data and gain deeper business insight, detect anomalies, perform automatic feature engineering and selection, and generate synthetic datasets. All you need is programming and some machine learning experience to get started.

Compare the strengths and weaknesses of the different machine learning approaches: supervised, unsupervised, and reinforcement learning
Set up and manage machine learning projects end-to-end
Build an anomaly detection system to catch credit card fraud
Cluster users into distinct and homogeneous groups
Perform semisupervised learning
Develop movie recommender systems using restricted Boltzmann machines
Generate synthetic images using generative adversarial networks

Publisher: O'Reilly - 400 pages, 1st edition, 18 March 2019

ISBN10: 1492035645 - ISBN13: 9781492035640

Order on www.amazon.fr:

43.01 € incl. tax (publisher's price 43.01 € incl. tax)

Fundamentals of Unsupervised Learning

Unsupervised Learning in the Machine Learning Ecosystem
End-to-End Machine Learning Project

Unsupervised Learning Using Scikit-Learn

Dimensionality Reduction
Anomaly Detection
Clustering
Group Segmentation

Unsupervised Learning Using TensorFlow and Keras

Autoencoders
Hands-On Autoencoder
Semisupervised Learning

Deep Unsupervised Learning Using TensorFlow and Keras

Recommender Systems Using Restricted Boltzmann Machines
Feature Detection Using Deep Belief Networks
Generative Adversarial Networks
Time Series Clustering

Editorial review by Thibaut Cuvelier, 2 July 2019

The title of this book promises a substantial applied component, and that is indeed the impression it gives: there are countless lines of code showing exactly what the author is doing, notably for his plots (the code that generates them appears in the book in full). All the code is written in Python 3, using the latest versions of the libraries, so that it remains usable for as long as possible. This applied side is present throughout the book: the author always tries to show a use for the algorithms he covers rather than settling for a laundry list, and the link with realistic applications is always there.

The book is built up progressively, with increasingly advanced techniques, first briefly presenting the theoretical concepts (without mathematics, as that is not the book's purpose), then the algorithms, before diving into practice. The approaches are very often compared on the same example in order to show their advantages and drawbacks. However, unsupervised learning is seen from a single angle only: exploiting unlabeled data in order to make predictions, that is, as an entirely supervised approach. As a result, all the data analysis aspects are neglected: it would have been nice, for example, to see clustering applied to understand what the data contains (such as determining, without preconceptions, the different ways of playing a game). Instead, in the clustering examples, the number of classes sought is known in advance.

In terms of presentation, a large amount of code, and sometimes of images, is redundant. In the first examples, which show several supervised learning algorithms, cross-validation is presented every time instead of focusing on the differences between the algorithms. Each chapter begins with a good page of Python module imports (including modules that are not used in that chapter!). Some parts present many images laid out so that they take up as much space as possible (six reasonably sized images spread over three pages, when shrinking them slightly would have let everything fit on a single page...). Furthermore, all the images are in black and white but were designed in color: it is often hard to make sense of them, because color information is used heavily (notably to show several curves: they surely have very different colors, but the gray levels are too similar to tell the curves apart).

The technical side really disappointed me. The algorithms are presented very quickly, and their parameters are more or less treated as black boxes or simply ignored: how can one understand their impact on the solution? Anomaly detection is treated only as an application of dimensionality reduction, with no discussion of the algorithms specifically designed for this task (isolation forests, one-class SVMs, etc.), which is rather reductive. There is no mention of embeddings (such as word2vec for word representation) in the section on autoencoders, even though this is a very important application.

The target audience seems to have only fairly limited experience in machine learning. The book will be most useful to those who want a quick, not overly deep introduction to unsupervised learning, an overview of the field that touches on all its main facets. Those who wonder what unsupervised learning can be used for will be well served, but will not see all of its possibilities.

dourouc05 - Qt & Books Lead

on 12/07/2019 at 19:00

Hands-On Unsupervised Learning Using Python

Many industry experts consider unsupervised learning the next frontier in artificial intelligence, one that may hold the key to general artificial intelligence. Since the majority of the world's data is unlabeled, conventional supervised learning cannot be applied. Unsupervised learning, on the other hand, can be applied to unlabeled datasets to discover meaningful patterns buried deep in the data, patterns that may be near impossible for humans to uncover.

Author Ankur Patel shows you how to apply unsupervised learning using two simple, production-ready Python frameworks: Scikit-learn and TensorFlow using Keras. With code and hands-on examples, data scientists will identify difficult-to-find patterns in data and gain deeper business insight, detect anomalies, perform automatic feature engineering and selection, and generate synthetic datasets. All you need is programming and some machine learning experience to get started.

Compare the strengths and weaknesses of the different machine learning approaches: supervised, unsupervised, and reinforcement learning
Set up and manage machine learning projects end-to-end
Build an anomaly detection system to catch credit card fraud
Cluster users into distinct and homogeneous groups
Perform semisupervised learning
Develop movie recommender systems using restricted Boltzmann machines
Generate synthetic images using generative adversarial networks

See the reviews.

Machine Learning for Data Streams

With Practical Examples in MOA

by Albert Bifet, Ricard Gavaldà, Geoff Holmes, and Bernhard Pfahringer

Target audience: Expert

Publisher's summary

A hands-on approach to tasks and techniques in data stream mining and real-time analytics, with examples in MOA, a popular freely available open-source software framework.

Today many information sources—including sensor networks, financial markets, social networks, and healthcare monitoring—are so-called data streams, arriving sequentially and at high speed. Analysis must take place in real time, with partial data and without the capacity to store the entire data set. This book presents algorithms and techniques used in data stream mining and real-time analytics. Taking a hands-on approach, the book demonstrates the techniques using MOA (Massive Online Analysis), a popular, freely available open-source software framework, allowing readers to try out the techniques after reading the explanations.

The book first offers a brief introduction to the topic, covering big data mining, basic methodologies for mining data streams, and a simple example of MOA. More detailed discussions follow, with chapters on sketching techniques, change, classification, ensemble methods, regression, clustering, and frequent pattern mining. Most of these chapters include exercises, an MOA-based lab session, or both. Finally, the book discusses the MOA software, covering the MOA graphical user interface, the command line, use of its API, and the development of new methods within MOA. The book will be an essential reference for readers who want to use data stream mining as a tool, researchers in innovation or data stream mining, and programmers who want to create new algorithms for MOA.

Publisher: MIT Press - 288 pages, 1st edition, 2 March 2018

ISBN10 : 0262037793 - ISBN13 : 9780262037792

Order from www.amazon.fr:

€46.84 incl. tax (publisher's price €46.84 incl. tax)

Introduction

Introduction
Big Data Stream Mining
Hands-on Introduction to MOA

Stream Mining

Streams and Sketches
Dealing with Change
Classification
Ensemble Methods
Regression
Clustering
Frequent Pattern Mining

The MOA Software

Introduction to MOA and Its Ecosystem
The Graphical User Interface
Using the Command Line
Using the API
Developing New Methods in MOA

Editorial review by Thibaut Cuvelier, 13 April 2019

Machine learning is a multifaceted field. This book dusts off one facet that is all too rarely explored in the literature: the study of data streams, where algorithms must make predictions but, above all, adapt in real time to data that arrive drop by drop (even if the dropper can have a very high throughput!). The authors give pride of place to the specifics of this paradigm: computations must be carried out very quickly, with almost no time available per sample, nor much memory for that matter.
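
To make the per-sample time and memory constraint concrete, here is a minimal sketch of a stream summary that is updated in constant time and never stores the samples. It uses Welford's running mean and variance, chosen here purely for illustration; it is not taken from the book or from MOA, and the names `RunningStats` and `summarize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Constant-memory summary of a numeric stream (Welford's algorithm)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        # One O(1) update per sample; the sample itself is never stored.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Population variance of everything seen so far (0.0 for an empty stream).
        return self.m2 / self.count if self.count else 0.0


def summarize(stream):
    """Consume an iterable one sample at a time and return its summary."""
    stats = RunningStats()
    for x in stream:
        stats.update(x)
    return stats


# Illustrative data only: any iterator works, including an unbounded one
# as long as the summary is read while the stream is still being consumed.
stats = summarize(iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
print(stats.count, stats.mean, stats.variance)
```

The summary occupies three numbers regardless of how many samples have gone by, which is the kind of budget the stream setting imposes.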

Structurally, the book has three clearly distinct parts:

a very general introduction to the field, which nevertheless shows the essentials of MOA, a software package dedicated to learning tasks on streams;
a more detailed presentation of the algorithms applicable to streams, whether to summarize them, to derive prediction models from them, or to explore the data. This part should appeal to students, professionals, and researchers who want to get started in the field, notably thanks to its many references (for the details of certain less interesting or overly advanced algorithms: one senses a real link between the book and current research in the field). The algorithms are detailed with a certain level of mathematical formalism, to make clear what they do (and why they guarantee a certain approximation of reality);
finally, a fairly succinct user guide to MOA, with a good number of screenshots of the software (printed in color!), which details the various tabs of the graphical interface (using very descriptive lists that are nonetheless tied to the other chapters of the book) and moves quickly over the command-line and programming interfaces (these last two chapters are brief and must be supplemented by the one on the graphical interface, which contains the essential elements).

One can nevertheless criticize a few forward references (section 4.6.2 sometimes assumes the content of 4.9.2 is already known, for example), as well as the omnipresence of MOA: one gets the impression that the authors focused on the algorithms available in this toolbox rather than presenting the most interesting algorithms in general. This remark is fairly minor, however, given how comprehensive MOA is.

Note: the book is also freely available in HTML format, and the authors reply to the comments left for them.

Malick - Community Manager

on 22/04/2019 at 0:24

Hello dear Club members,

I invite you to read the review that Dourouc05 wrote for you about the book:

Machine Learning for Data Streams
With Practical Examples in MOA

Machine learning is a multifaceted field. This book dusts off one facet that is all too rarely explored in the literature: the study of data streams, where algorithms must make predictions but, above all, adapt in real time to data that arrive drop by drop (even if the dropper can have a very high throughput!).
The authors give pride of place to the specifics of this paradigm: computations must be carried out very quickly, with almost no time available per sample, nor much memory for that matter... Read the rest of the review...

Happy reading

dourouc05 - Qt & Books Lead

on 13/04/2019 at 19:29

Machine Learning for Data Streams

With Practical Examples in MOA
by Albert Bifet, Ricard Gavaldà, Geoff Holmes, and Bernhard Pfahringer

Natural Language Processing with Python

by Steven Bird, Ewan Klein, and Edward Loper

Target audience: Intermediate

Publisher's summary

This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.

Packed with examples and exercises, Natural Language Processing with Python will help you:

Extract information from unstructured text, either to guess the topic or identify "named entities"
Analyze linguistic structure in text, including parsing and semantic analysis
Access popular linguistic databases, including WordNet and treebanks
Integrate techniques drawn from fields as diverse as linguistics and artificial intelligence

This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.

Publisher: O'Reilly - 512 pages, 1st edition, 7 July 2009

ISBN10 : 0596516495 - ISBN13 : 9780596516499

Order from www.amazon.fr:

€34.47 incl. tax (publisher's price €36.06 incl. tax)

Chapter 1. Language Processing and Python
Chapter 2. Accessing Text Corpora and Lexical Resources
Chapter 3. Processing Raw Text
Chapter 4. Writing Structured Programs
Chapter 5. Categorizing and Tagging Words
Chapter 6. Learning to Classify Text
Chapter 7. Extracting Information from Text
Chapter 8. Analyzing Sentence Structure
Chapter 9. Building Feature-Based Grammars
Chapter 10. Analyzing the Meaning of Sentences
Chapter 11. Managing Linguistic Data

Editorial review by Franck Dernoncourt, 1 February 2012

Used in more than a hundred courses around the world and freely available online at http://www.nltk.org/book (CC BY-NC-ND license), this book offers an excellent introduction to natural language processing, explaining the theory through concrete implementation examples. It is thus meant as a practical introduction to the field, as opposed to a purely theoretical one. Each chapter ends with a series of exercises ranked by difficulty, but unfortunately without solutions.

The book's main distinguishing feature is that it presents many code examples, based on the free, open-source NLTK library (http://www.nltk.org), written in Python by, among others, the authors of this book. Very well documented, NLTK offers many language-processing features (lexical analysis, part-of-speech tagging, syntactic parsing, etc.) while also interfacing with databases such as WordNet and with third-party libraries and software such as the Stanford Tagger part-of-speech tagger and the Prover9 automated theorem prover. A large number of corpora are also available through NLTK, which is very welcome for training as well as for testing, performance testing in particular. Since the book presents the many facets of natural language processing, its examples cover a large part of NLTK's functionality.

The main limitation of NLTK is Python's performance in terms of computation speed. Using Python does, however, keep the language barrier from getting in the reader's way, Python being to date unquestionably one of the most accessible languages. For those with little or no Python experience, some sections of the book are devoted solely to explaining the Python language, which makes the book accessible to any audience.

Nevertheless, although it gives an excellent and concrete overview of the whole of natural language processing, the book's focus on Python examples mechanically leaves less room for theoretical considerations. In that sense, it is an ideal complement to the reference book Speech and Language Processing (by Daniel Jurafsky and James H. Martin), whose approach is much more theoretical.

Editorial review by Julien Plu, 1 May 2012

This book on NLTK is really well written. No prior experience in natural language processing is needed to tackle it; it will teach you everything you need to understand each chapter. The only requirement is knowledge of the Python language.
The examples are not only simple but also very useful, because they are things one might need in an application. I particularly liked the chapters on named entity extraction, learning to build a classifier, and analyzing the meaning of a sentence, which are especially well done and well explained.
My only remark is the lack of detail on all the possibilities for creating and using a grammar, whether via NLTK regular expressions or not.

Djug - Senior Eminent Expert

on 06/02/2012 at 8:11

Hello,

The DVP editorial team has read the following book for you: Natural Language Processing with Python, by Steven Bird, Ewan Klein, and Edward Loper.

Have you read it? Do you plan to read it soon?

What is your opinion?

Speak up! Your opinion matters to us.

Franck Dernoncourt - Emeritus Member

on 06/02/2012 at 9:00

Here is a list of definitions I found interesting in this book (page references are given as book page number / page number in my PDF):

hypernym/hyponym relation, i.e., the relation between superordinate and subordinate concepts (p69 / 90)
Another important way to navigate the WordNet network is from items to their components (meronyms) or to the things they are contained in (holonyms) (p70 / 91)
the same dictionary word (or lemma) (p104 / 125)
strip off any affixes, a task known as stemming. (p107 / 128)
Tokenization is the task of cutting a string into identifiable linguistic units that constitute a piece of language data (p109 / 130)
Tokenization is an instance of a more general problem of segmentation. (p112 / 133)
The %s and %d symbols are called conversion specifiers (p118 / 139)
The process of classifying words into their parts-of-speech and labeling them accordingly is known as part-of-speech tagging, POS tagging, or simply tagging. Parts-of-speech are also known as word classes or lexical categories. The collection of tags used for a particular task is known as a tagset. Our emphasis in this chapter is on exploiting tags, and tagging text automatically. (p179 / 200)
As n gets larger, the specificity of the contexts increases, as does the chance that the data we wish to tag contains contexts that were not present in the training data. This is known as the sparse data problem, and is quite pervasive in NLP. As a consequence, there is a trade-off between the accuracy and the coverage of our results (and this is related to the precision/recall trade-off in information retrieval) (p205 / 226)
A convenient way to look at tagging errors is the confusion matrix. It charts expected tags (the gold standard) against actual tags generated by a tagger (p207 / 228)
All languages acquire new lexical items. A list of words recently added to the Oxford Dictionary of English includes cyberslacker, fatoush, blamestorm, SARS, cantopop, bupkis, noughties, muggle, and robata. Notice that all these new words are nouns, and this is reflected in calling nouns an open class. By contrast, prepositions are regarded as a closed class. That is, there is a limited set of words belonging to the class. (p211 / 232)
Common tagsets often capture some morphosyntactic information, that is, information about the kind of morphological markings that words receive by virtue of their syntactic role. (p212 / 233)
Classification is the task of choosing the correct class label for a given input. (p221 / 242)
The first step in creating a classifier is deciding what features of the input are relevant, and how to encode those features. For this example, we’ll start by just looking at the final letter of a given name. The following feature extractor function builds a dictionary containing relevant information about a given name. (p223 / 244)
Recognizing the dialogue acts underlying the utterances in a dialogue can be an important first step in understanding the conversation. The NPS Chat Corpus, which was demonstrated in Section 2.1, consists of over 10,000 posts from instant messaging sessions. These posts have all been labeled with one of 15 dialogue act types, such as “Statement,” “Emotion,” “y/n Question,” and “Continuer.” (p235 / 256)
Recognizing textual entailment (RTE) is the task of determining whether a given piece of text T entails another text called the “hypothesis”. (p235 / 256)
A confusion matrix is a table where each cell [i,j] indicates how often label j was predicted when the correct label was i. (p240 / 261)
Numeric features can be converted to binary features by binning, which replaces them with features such as “4<x<6.” (p249 / 270)
Named entities are definite noun phrases that refer to specific types of individuals, such as organizations, persons, dates, and so on. The goal of a named entity recognition (NER) system is to identify all textual mentions of the named entities. This can be broken down into two subtasks: identifying the boundaries of the NE, and identifying its type. (p281 / 302)
Since our grammar licenses two trees for this sentence, the sentence is said to be structurally ambiguous. The ambiguity in question is called a prepositional phrase attachment ambiguity. (p299 / 320)
A grammar is said to be recursive if a category occurring on the left hand side of a production also appears on the righthand side of a production. (p301 / 322)
A parser processes input sentences according to the productions of a grammar, and builds one or more constituent structures that conform to the grammar. A grammar is a declarative specification of well-formedness—it is actually just a string, not a program. A parser is a procedural interpretation of the grammar. It searches through the space of trees licensed by a grammar to find one that has the required sentence along its fringe. (p302 / 323)
Phrase structure grammar is concerned with how words and sequences of words combine to form constituents. A distinct and complementary approach, dependency grammar, focuses instead on how words relate to other words. (p310 / 331)
A dependency graph is projective if, when all the words are written in linear order, the edges can be drawn above the words without crossing. (p311 / 332)
In the tradition of dependency grammar, the verbs in Table 8-3 (whose dependents have Adj, NP, S and PP, which are often called complements of the respective verbs, are different) are said to have different valencies. (p313 / 335)
This ambiguity is unavoidable, and leads to horrendous inefficiency in parsing seemingly innocuous sentences. The solution to these problems is provided by probabilistic parsing, which allows us to rank the parses of an ambiguous sentence on the basis of evidence from corpora. (p318 / 339)
A probabilistic context-free grammar (or PCFG) is a context-free grammar that associates a probability with each of its productions. It generates the same set of parses for a text that the corresponding context-free grammar does, and assigns a probability to each parse. The probability of a parse generated by a PCFG is simply the product of the probabilities of the productions used to generate it. (p320 / 341)
We can see that morphological properties of the verb co-vary with syntactic properties of the subject noun phrase. This co-variance is called agreement. (p329 / 350)
A feature path is a sequence of arcs that can be followed from the root node (p339 / 360)
A more general feature structure subsumes a less general one. (p341 / 362)
Merging information from two feature structures is called unification. (p342 / 363)
The two sentences in (5) can be both true, whereas those in (6) and (7) cannot be. In other words, the sentences in (5) are consistent, whereas those in (6) and (7) are inconsistent. (p365 / 386)
A model for a set W of sentences is a formal representation of a situation in which all the sentences in W are true. (p367 / 388)
An argument is valid if there is no possible situation in which its premises are all true and its conclusion is not true. (p369 / 390)
In the sentences "Cyril is tall. He likes maths.", we say that he is coreferential with the noun phrase Cyril. (p373 / 394)
In the sentence "Angus had a dog but he disappeared.", "he" is bound by the indefinite NP "a dog", and this is a different relationship than coreference. If we replace the pronoun he by a dog, the result "Angus had a dog but a dog disappeared" is not semantically equivalent to the original sentence "Angus had a dog but he disappeared." (p374 / 395)
In general, an occurrence of a variable x in a formula F is free in F if that occurrence doesn’t fall within the scope of all x or some x in F. Conversely, if x is free in formula F, then it is bound in all x.F and exists x.F. If all variable occurrences in a formula are bound, the formula is said to be closed. (p375 / 396)
The general process of determining truth or falsity of a formula in a model is called model checking. (p379 / 400)
Principle of Compositionality: the meaning of a whole is a function of the meanings of the parts and of the way they are syntactically combined. (p385 / 406)
λ is a binding operator, just as the first-order logic quantifiers are. (p387 / 408)
A discourse representation structure (DRS) presents the meaning of discourse in terms of a list of discourse referents and a list of conditions. The discourse referents are the things under discussion in the discourse, and they correspond to the individual variables of first-order logic. The DRS conditions apply to those discourse referents, and correspond to atomic open formulas of first-order logic. (p397 / 418)
Inline annotation modifies the original document by inserting special symbols or control sequences that carry the annotated information. For example, when part-of-speech tagging a document, the string "fly" might be replaced with the string "fly/NN", to indicate that the word fly is a noun in this context. In contrast, standoff annotation does not modify the original document, but instead creates a new file that adds annotation information using pointers that reference the original document. For example, this new document might contain the string "<token id=8 pos='NN'/>", to indicate that token 8 is a noun. (p421 / 442)
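
A few of the definitions above (the final-letter feature extractor, the confusion matrix, binning, and the probability of a PCFG parse) can be made concrete with a short sketch in plain Python. This is only an illustration under my own assumptions: the function names and the sample data are hypothetical, and none of it is NLTK's API or the book's code.

```python
from collections import Counter
from math import prod


def last_letter_feature(name):
    # Feature extractor: a dictionary describing the final letter of a name.
    return {"last_letter": name[-1].lower()}


def confusion_matrix(gold, predicted):
    # Cell (i, j) counts how often label j was predicted
    # when the correct (gold standard) label was i.
    return Counter(zip(gold, predicted))


def bin_feature(x, low, high):
    # Binning: turn a numeric value into a binary feature such as "4<x<6".
    return {f"{low}<x<{high}": low < x < high}


def parse_probability(production_probs):
    # PCFG: the probability of a parse is the product of the
    # probabilities of the productions used to generate it.
    return prod(production_probs)


# Made-up tags and probabilities, for illustration only.
gold = ["NN", "NN", "VB", "JJ"]
pred = ["NN", "VB", "VB", "JJ"]
matrix = confusion_matrix(gold, pred)

print(last_letter_feature("Shrek"))
print(matrix[("NN", "VB")])  # times an NN token was tagged VB
print(bin_feature(5, 4, 6))
print(parse_probability([1.0, 0.5, 0.2]))
```

Storing the confusion matrix as a `Counter` keyed by (gold, predicted) pairs keeps the sketch short; a cell that never occurs simply reads as 0.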

Another NLP dictionary available online: http://www.cse.unsw.edu.au/~billw/nlpdict.html

Franck Dernoncourt - Emeritus Member

on 06/02/2012 at 20:23

Also, for those interested in the subject, Stanford is launching an introductory course on natural language processing: http://www.nlp-class.org/