S**N
... book could be 100 pages long if all the unnecessary log/console/etc. print-outs
This book could be 100 pages long if all the unnecessary log/console/etc. print-outs, as well as the exhaustive configuration tables that are halfway outdated by the next release, were removed and pointed to online. Otherwise, a pretty good overview, if a little Cloudera-centric.
M**G
Excellent book for Hadoop security
I am an enterprise security architect working on projects using the Hadoop ecosystem. This book is excellent. It gives a good overview of major topics, then dives deep. I recommend it to anyone wanting to know more about Hadoop security. This book enables one to use that knowledge to architect Hadoop-based business systems that process sensitive or regulated information.
K**A
be in good condition
be in good condition
T**J
Triple J's SDSUG Review 3
"Hadoop Security: Protecting Your Big Data Platform" is an excellent, well-written book that describes Apache Hadoop and the numerous security features within it that can be implemented. The book starts with a basic history of how and why Apache Hadoop was developed and then breaks down how it can be secured in three (3) sections: 1) Security Architecture, 2) Authentication, Authorization & Accounting (AAA) and 3) Data Security. A fourth section, entitled Putting It All Together, summarizes the first three. The best thing about this book is that it does a very thorough job of not only explaining the functionality of the very complex Apache Hadoop system and all its components, but also explaining how to configure the built-in security features within these components. Actual code segments that provide more detail on enhancing the security of these components make this book an excellent reference guide for any security professional. I would highly recommend adding this book to your IT security collection if you are facing the daunting task of securing an Apache Hadoop system.
M**N
Comprehensive Coverage Of Hadoop Security
As a follow-up to my basic intro to Hadoop with Hadoop 2 Quick Start Guide, I wanted more detail on the security features available in the Hadoop ecosystem. This book sounded like it fitted the bill and was recently published (June 2015), so I figured it would be pretty up to date.
One thing that I immediately liked about the book is that, apart from a very brief few pages of intro to security concepts, it gets straight into things, which for me is always a good indication that there won't be much padding in the book.
The book starts off with a section on security architecture, beginning with a basic look at threat modelling for distributed systems. This is a nice touch, as threat modelling really should be part of any security architecture discussion, and even touching on it at a high level is great, as it puts the whole book in context.
The next chapter moves onto general security architectures in a Hadoop environment, covering network-level segregation, OS-level security and an overview of the different types of Hadoop node roles. This was a great start to the book, as it immediately starts working through the different nodes, what user roles need access to them, what nodes can be segregated from direct access and how, at a high level, they interact for data loads and job submission.
The final chapter of the architecture section finishes up with an overview of Kerberos. This initially seemed a bit strange, but it becomes obvious why later on, as Kerberos plays such a key role in Hadoop security.
If you need to get up to speed quickly on Kerberos, I’d highly recommend Kerberos: A Network Authentication System… it’s a quick and easy read that I read over ten years ago and it’s still as good now as it was then.
The next section dives deeper into authentication, and at this point the book gets straight into the hands-on configuration guide, covering the detailed configuration steps required to map Kerberos principals into the Hadoop world, how to map to local users, how user groups work in Hadoop and mapping to LDAP groups. The chapter then moves on to cover the various authentication protocols in use across the Hadoop ecosystem, before explaining the differences between simple and Kerberos authn and then a nice dive into token auth, including the flows of how delegation tokens are created to allow various systems to impersonate users. The chapter finishes off with a fully worked Kerberos authn configuration guide, which to be fair I skimmed over as I don’t need that level of detail at the moment.
The next chapter moves onto authz, covering HDFS ACLs and extended ACLs and various service-level authorisations, before moving on to MapReduce (1 and 2) and YARN, and ZooKeeper ACLs, HBase, and Oozie. There are a few nice worked examples here of the effects of authz restrictions and what errors users will see when their access is restricted.
The book then moves on to cover Sentry, which is Apache’s attempt to centralise authz within the Hadoop ecosystem; after reading through the previous few chapters, it’s obvious it’s needed! The chapter covers the basic architecture on which Sentry works and how it integrates with the various applications, and then walks through how to configure each application to use Sentry. Again, a very practically oriented approach is taken here, with a lot of detail on the configuration steps.
The last chapter in this section covers the logging available by default in each of the various applications and their basic config.
This is a quick chapter and really just goes to show the configuration aspect, rather than any approaches to analysing the logs.
The third section of the book moves onto data security, specifically the encryption of data in transit and at rest, starting with great coverage of how HDFS file encryption works. What was particularly good in this chapter was the strong emphasis it places on key management, and on making the reader conscious of the potential lack of encryption on temporary data such as logs. The second half of the chapter covers encryption of data in transit, mainly focusing on the configuration of SSL/TLS in the various applications in the ecosystem.
The next chapter is a short one and looks at the security of data as it is loaded into the Hadoop ecosystem, covering both the confidentiality and integrity of the data, but mainly focusing on confidentiality/encryption. The following chapter then covers how client access to data in the Hadoop environment can be performed securely, focusing of course on the edge nodes and how users interact with them, through command line RPC or APIs. From an architecture perspective, I found this chapter particularly helpful, as it does a good job of describing the trust boundary that will exist in most deployments and how this should be architected securely.
The last chapter in this section covers Cloudera Hue, and to be honest I just skimmed this one as it wasn’t relevant to me.
The final section of the book covers some use cases nicely, outlining scenarios with business and security requirements before walking through how to architect and configure the right mix of controls to meet them. I would have loved more examples here, as this is more at the level I’m working at than the technical configuration. But still, it is great to see it presented in this way.
Overall, this was a great book that, to be fair, goes into a lot more depth on technical configuration settings than I needed.
This can make it a tough read if you’re just looking for the high level; however, if you’re setting up a Hadoop cluster, then this should be your go-to book.
However, it also works great at the level I was looking for, as it has a strong focus on architecture considerations and puts the security functionality into context rather than just explaining the feature sets available. You just may need to skim some of the more detailed sections like I did!
B**E
Big data requires a big security solution. This book shows you how.
At the recent RSA Conference, there were scores of vendors offering various endpoint solutions to protect laptops, desktops and mobile devices. These software solutions are clearly needed given the value of the data on these devices.
When it comes to Hadoop, firms are storing massive amounts of data (massive as in petabytes and more), often without the same level of security they have on a laptop.
In Hadoop Security: Protecting Your Big Data Platform, authors Ben Spivey and Joey Echeverria have written an invaluable reference for anyone looking to ensure their Hadoop data sets are appropriately secured. This is the type of book that you want your Hadoop administrators to have close by.
Hadoop is one part of an open-source software framework for big data. The authors correctly note that one of the hardest jobs a Hadoop security administrator has is to keep track of how the many components handle security, access control and other authentication functions.
With that, the book does a great job of providing the reader with a thorough overview of the various aspects of the Hadoop security framework. The authors detail how to correctly configure the security features for each of the components. The authors also provide actual code segments for each of the areas.
Anyone using Hadoop should certainly make sure their staff know about the security controls required to keep their data sets safe. For that, make sure they read Hadoop Security: Protecting Your Big Data Platform.
M**I
Book OK, 5 stars, BUT left OUTSIDE the mailbox!
First of all, I know I should say that it is a good book, that it interests me, that it is complete and that I bought it because I need it, but I am not giving 5 stars because ... it was left OUTSIDE the mailbox! I stayed home all day waiting for it and they did not even make the effort to ring my intercom. Delivery failed, negative rating for the courier.
B**S
Five Stars
Good in-depth insight regarding the security protection mechanism deployed in Hadoop. It is a good place to start.
C**N
Comprehensive and effective
While Big Data is a relatively new subject, Big Data security is even newer. Moreover, protecting the data aggregated in these platforms is a major issue (Big Data = Big Problem). However, the complexity of the subject is multiplied by the number of applications that exist in the Hadoop ecosystem. To complicate matters further, use cases are often poorly understood when these platforms are set up, which does not make it any easier to identify the security requirements.
This book therefore meets a strong need for a single, consolidated source of information on Big Data security, illustrated with examples.
The book presents the various security features that are available and can be enabled in the applications of the Hadoop ecosystem to guarantee Availability, Integrity, Confidentiality and Proof. The security risks facing these platforms are also presented.
In the introduction, the role of the various Hadoop applications is explained briefly, so that the book can quickly concentrate on the purely security-related aspects. These are presented in the logical order of their implementation in a cluster: authentication, authorization, logging.
The last chapter sets out a few implementation cases in different enterprise contexts. These scenarios are particularly useful for linking theory and practice.
Securing a Big Data platform is an opportunity to dig into already familiar protocols such as Kerberos, LDAP, SSL and SAML, and the permission and ACL principles of the Unix world.
For someone wishing to enable these security measures in a Cloudera distribution, which was my case, the examples provided are pure Hadoop and therefore do not mention the tools available in Cloudera.
This is nonetheless an advantage, because it lets you understand the underlying Hadoop configuration in Cloudera.
The configuration examples provided are precise, neither too specific nor too conceptual, and well commented.
The chapter breakdown is also fairly detailed, which makes it easy to find information again afterwards.
The book reads smoothly and the content is easy to understand for a non-native English speaker like me.
I would have liked more detail on Kerberos integration for the cluster's client applications and the potential impacts of that integration. In addition, no mention is made of the Knox and Rhino technologies. These solutions are not packaged in Cloudera, but the title of the book indicates that it deals with Hadoop and not with one distribution in particular. For that matter, a comparison of the security features available in each Hadoop distribution would have been appreciated.
The explanatory diagrams are not the book's strong point. Often too simple, they are not much help.
I highly recommend this book to Hadoop experts or auditors of Big Data platforms. It is hard to find information this reliable and precise, gathered in a single place, for free. I hope that further editions will follow, because the subject of Big Data security is still evolving very quickly. I recommend the print version for browsing more efficiently through the many pages of command or configuration listings.