
EbookNice.com

Most ebook files are in PDF format, so you can easily read them using various software such as Foxit Reader, or directly in the Google Chrome browser.
Some ebook files are released by publishers in other formats such as .azw, .mobi, .epub, .fb2, etc. You may need to install specific software, such as Calibre, to read these formats on mobile or PC.

Please read the tutorial at this link: https://ebooknice.com/page/post?id=faq


We offer FREE conversion to the popular formats you request; however, this may take some time. Therefore, right after payment, please email us, and we will try to provide the service as quickly as possible.


For some exceptional file formats or broken links (if any), please refrain from opening any disputes. Instead, email us first, and we will try to assist within a maximum of 6 hours.

EbookNice Team

(Ebook) Reinforcement Learning: Industrial Applications of Intelligent Agents by Phil Winder, Ph.D. ISBN 9781098114831, 1098114833

  • SKU: EBN-23398948
$32 (regular price $40, -20%)

Status:

Available

Instant download of (Ebook) Reinforcement Learning: Industrial Applications of Intelligent Agents after payment.
Authors: Phil Winder, Ph.D.
Pages: 408
Year: 2020
Edition: 1
Publisher: O'Reilly Media, Incorporated
Language: English
File Size: 18.82 MB
Format: PDF
ISBNs: 9781098114831, 1098114833
Categories: Ebooks

Product description

(Ebook) Reinforcement Learning: Industrial Applications of Intelligent Agents by Phil Winder, Ph.D. ISBN 9781098114831, 1098114833

Reinforcement learning (RL) is a machine learning (ML) paradigm that is capable of optimizing sequential decisions. RL is interesting because it mimics how we, as humans, learn. We are instinctively capable of learning strategies that help us master complex tasks like riding a bike or taking a mathematics exam. RL attempts to copy this process by interacting with the environment to learn strategies.

Recently, businesses have been applying ML algorithms to make one-shot decisions. These are trained upon data to make the best decision at the time. But often, the right decision at the time may not be the best decision in the long term. Yes, that full tub of ice cream will make you happy in the short term, but you'll have to do more exercise next week. Similarly, click-bait recommendations might have the highest click-through rates, but in the long term these articles feel like a scam and hurt long-term engagement or retention.

RL is exciting because it is possible to learn long-term strategies and apply them to complex industrial problems. Businesses and practitioners alike can use goals that directly relate to the business, like profit, number of users, and retention, not technical evaluation metrics like accuracy or F1-score. Put simply, many challenges depend on sequential decision making. ML is not designed to solve these problems; RL is.

Objective

I wrote this book because I have read about so many amazing examples of using RL to solve seemingly impossible tasks. But all of these examples were from academic research papers, and the books I subsequently read were either targeted toward academia or were glorified code listings. Hardly any had an industrial perspective or explained how to use RL in production settings.
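The agent-environment loop described above can be sketched in a few lines of Python. This is a minimal illustration only, not code from the book (which deliberately keeps its code in an accompanying repository): the five-state corridor environment and the use of tabular Q-learning, one of the value methods the book covers, are assumptions made purely for this sketch.

```python
import random

random.seed(0)

# Hypothetical toy environment: a five-state corridor. The agent starts in
# state 0 and receives a reward of 1.0 only on reaching state 4, which ends
# the episode. The long-term optimal strategy is to always move right.
N_STATES = 5
ACTIONS = [-1, +1]  # move left, move right

def step(state, action):
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

# Tabular Q-learning: estimate the long-term value of each (state, action).
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration

def greedy(state):
    # Pick the highest-valued action, breaking ties at random.
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly follow the current strategy, occasionally explore.
        action = random.choice(ACTIONS) if random.random() < epsilon else greedy(state)
        next_state, reward, done = step(state, action)
        # Nudge the estimate toward the reward plus the discounted value
        # of the best action available in the next state.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# The learned strategy for each non-terminal state.
policy = {s: greedy(s) for s in range(N_STATES - 1)}
print(policy)
```

The agent is never told to move right; it discovers that strategy because acting, observing rewards, and updating value estimates over many episodes propagates the delayed reward back to earlier states.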
I knew how powerful this technology could be, so I set out to write a book about using RL in industry.

When I started writing, I wanted to concentrate on the operational aspects, but I quickly realized that hardly anyone in industry had heard of RL, let alone run RL in production. Also, throughout my reader research, I found that many engineers and data scientists had never even seen a lot of the underlying algorithms. So this book morphed into part fundamental explanation and part practical implementation advice. My hope is that this book will inspire and encourage the use of RL in industrial settings.

I believe that this is the first book to discuss operational RL concerns, and certainly the only book that has combined algorithmic and operational developments into a coherent picture of the RL development process.

Who Should Read This Book?

The aim of this book is to promote the use of RL in production systems. If you are (now or in the future) building RL products, whether in research, development, or operations, then this book is for you. This also means that I have tailored this book more toward industry than academia.

Guiding Principles and Style

I decided on a few guiding principles that I thought were important for a book like this, based upon my own experience with other books.

The first is that I entirely avoid code listings. I believe that in most cases books are not an appropriate place for code listings; software engineering books are an obvious exception. This goes against conventional wisdom, but personally, I'm sick of skipping over pages and pages of code. I buy books to hear the thoughts of the author, the way they explain the concepts, the insights. Another reason for not printing code is that many of the implementations, especially in later chapters, are really quite complex, with a lot of optimization detail that detracts from the main ideas that I want to teach. You would typically use a library implementation anyway. And then there are the algorithms that don't have implementations yet because they are too new or too complex to be merged into the standard libraries. For all these reasons and more, this is not a typical "show-me-the-code" book.

But don't worry, this doesn't mean there is no code at all. There is, but it's in an accompanying repository, along with lots of other practical examples, how-to guides, reviews, collections of papers, and lots more content (see "Supplementary Materials"). And what this does mean is that there is more room for insight, explanations, and, occasionally, a few bad jokes. You will walk away from reading this book appreciating the amount and density of the content, the breadth of coverage, and the fact that you have not had to skip over pages of code.

The second principle I had was about the math. RL is a highly mathematical topic, because it is usually much easier to explain an algorithm with a few lines of mathematics than with 20 lines of code. But I totally appreciate how mathematics can seem like an alien language sometimes. Like any other programming language, mathematics has its own syntax, assumed knowledge, and built-in functions that you have to know before you can fully appreciate it.

So throughout this book I don't shy away from the mathematics, especially during the explanations of the algorithms fundamental to RL, because they are an important part. However, I do try to limit the mathematics where I can and provide long explanations where I can't. I generally try to follow the notation provided by Thomas and Okal's Markov Decision Process Notation, Version 1.1, but I often abuse the notation to make it even simpler.

The third principle, which you might find different from other technical books that focus more on best practices and the art of engineering, relates to the fact that RL development has been driven by research, not by experience. So this book is chock-full of references to research papers. I attempt to collate and summarize all of this research to provide you with a broad understanding of the state of the art. I also try to balance the depth that I go into.

As a teacher, this is a really hard thing to do, because you might be an expert already, or you might be a complete novice who has just learned how to code. I can't please everyone, but I can aim for the middle. On average, I hope you will feel that there is a good balance between giving you enough information to feel confident and simplifying enough to prevent you from being overwhelmed. If you do want to go into more depth in particular subjects, then please refer to the research papers, references, and other academic books. If you are feeling overwhelmed, take your time; there's no rush. I've provided lots of links to other resources that will help you along your way.

The fourth principle is that I always attempt to point out pitfalls or things that can go wrong. I have spoken to some people who take this to mean that RL isn't ready or I don't believe in it; it is ready and I do believe in it. But it is vitally important to understand the unknowns and the difficulties so you are not overpromising or failing to allocate enough time to do the work. This is certainly not "normal" software engineering. So wherever you see "challenges" or explanations of "how to improve," this is vital and important information. Failure is the best teacher.

Prerequisites

This all means that RL is quite an advanced topic, before you even get started. To enjoy this book the most, you would benefit from some exposure to data science and machine learning, and you will need a little mathematics knowledge. But don't worry if you don't have this. You can always learn it later. I provide lots of references and links to further reading and explain ancillary concepts where it makes sense. I promise that you will still take away a huge amount of knowledge.

Scope and Outline

The scope of the book spans your journey of trying to move RL products into production. First, you need to learn the basic framework that RL is built around. Next you move on to simple algorithms that exploit this framework. Then you can learn about more and more advanced algorithms that are capable of greater feats. Then you need to think about how to apply this knowledge to your industrial problem. And finally, you need to design a robust system to make it operationally viable.

This is the path that the book follows, and I recommend that you read it linearly, from start to finish. Later chapters build upon ideas in the early chapters, so you may miss out on something if you skip them. However, feel free to skip to specific chapters or sections that interest you. Whenever necessary, I link back to previous sections.

Here is an overview to whet your appetite:

Chapter 1, Why Reinforcement Learning?
The book begins with a gentle introduction to the history and background of RL, drawing inspiration from other scientific disciplines. It sets the groundwork and gives you an overview of all the different types of algorithms in RL.

Chapter 2, Markov Decision Processes, Dynamic Programming, and Monte Carlo Methods
The hard work begins with a chapter defining the fundamental concepts in RL, including Markov decision processes, dynamic programming, and Monte Carlo methods.

Chapter 3, Temporal-Difference Learning, Q-Learning, and n-Step Algorithms
In this chapter you graduate to so-called value methods, which attempt to quantify the value of being in a particular state, the basic approach that dominates all modern RL.

Chapter 4, Deep Q-Networks
Much of the recent excitement has been due to the combination of value methods with deep learning. You will dive into this concoction, and I promise you will be surprised by the performance of these algorithms.

Chapter 5, Policy Gradient Methods
Now you'll learn about the second most popular form of RL algorithms, policy gradient methods, which attempt to nudge a parameterized strategy toward better performance. The primary benefit is that they can handle continuous actions.

Chapter 6, Beyond Policy Gradients
Basic policy gradient algorithms have a range of issues, but this chapter considers and fixes many of the problems that they suffer from. The promise of off-policy training is also introduced to improve efficiency.

Chapter 7, Learning All Possible Policies with Entropy Methods
Entropy methods have proven to be robust and capable of learning strategies for complex activities such as driving cars or controlling traffic flow.

Chapter 8, Improving How an Agent Learns
Taking a step back from the core RL algorithms, this chapter investigates how ancillary components can help solve difficult problems. Here I focus on different RL paradigms and alternative ways to formulate the Markov decision process.

Chapter 9, Practical Reinforcement Learning
This is the first of two chapters on building production RL systems. This chapter walks you through the process of designing and implementing industrial RL algorithms. It describes the process, design decisions, and implementation practicalities.

Chapter 10, Operational Reinforcement Learning
If you want advice on how to run RL products in production, then this chapter is for you. Here I delve into the architectural design that you should consider to make your solution scale and be more robust, then detail the key aspects you need to watch out for.

Chapter 11, Conclusions and the Future
The final chapter is not just another summary. It contains a wealth of practical tips and tricks that you will find useful during your RL journey and presents suggestions for future research.

Supplementary Materials

I have created the website https://rl-book.com to organize all of the extra materials that accompany this book. Here you will find accompanying code, in-depth articles and worksheets, comparisons and reviews of RL technology, databases of current RL case studies, and much more. See "Guiding Principles and Style" to find out why there is no code printed in this book.

The reason for creating a whole website, rather than just a code repository, was because I believe that RL is more than just code. It's a paradigm-changing way of thinking about how decisions can have long-term effects. It's a new set of technology, and it needs a totally different architecture. For all of these reasons and more, this supplementary information does not fit in a repository. It doesn't suit being printed, because it might change rapidly or is just inefficient. So I created this ecosystem that I am sure you will find valuable. Make sure you check it out, and if there's anything missing, let me know.
*Free conversion into popular formats such as PDF, DOCX, DOC, AZW, EPUB, and MOBI after payment.
