2 Intern
•
718 Posts
2
8493
October 9th, 2014 14:00
Ask the Expert: Store Everything, Analyze Everything, and Build What You Need with EMC Hadoop Storage Solutions
Welcome to this EMC Ask the Expert session. On this occasion we'll answer questions on EMC Big Data Solutions such as Isilon, ViPR, and the ECS Appliance.
EMC has always been about data; storage is just the means to keep, access, protect and use it. Big Data is the latest data management challenge. That’s why EMC was so excited to be at Hadoop World and showcase our storage and data management solutions for Big Data. EMC does not just tackle storage problems; we solve data management challenges. Our experts are getting ready to take all of your questions on this topic.
Here are your Subject Matter Experts:
George Hamilton is a senior product marketing manager for EMC ViPR Global Data Services and EMC Centera and Atmos object storage platforms. George has nearly 20 years of technology industry experience. Prior to joining EMC, he was an industry analyst and research director for Yankee Group covering cloud computing and services, IT infrastructure and IT management software. Connect with George on Twitter.
Ryan M Peterson MBA is an internationally recognized industry expert with repeated success in the design, development, and delivery of ground-breaking, high performance technology solutions. He is a technology thought leader and pioneer with a “can do” attitude, who finds workable, technologically advanced solutions to complex issues. Ryan currently directs the efforts of EMC Isilon’s Solutions Architecture organization focusing on industry integration with the best of breed applications and technologies offered. He has also become recognized as a thought leader within the area of Big Data Analytics applications such as Hadoop and enjoys discussing the future of technology and its positive impact on the world. Connect with Ryan on Twitter.
This event will take place from October 27th to November 7th, 2014.
Share this event on Twitter:
>> Join our Ask the Expert: Store Everything, Analyze Everything, and Build What You Need with EMC Hadoop Storage Solutions http://bit.ly/1CYVHrg #EMCATE <<
RobertoAraujo1
2 Intern
•
718 Posts
0
October 27th, 2014 08:00
Welcome everyone, this ATE session has begun. Our experts are now ready to answer any question you post on this thread. Enjoy!
helghareeb
1 Rookie
•
19 Posts
0
October 27th, 2014 23:00
Hello Everyone,
Thanks for the opportunity. I have more than one question; I hope this is not a problem.
Sorry for the multiple questions. I understand I might be needing to learn more about the technologies. However, I find it an excellent chance to get replies from the experts. Replies may include references to external resources. Thanks again.
Nikschen
179 Posts
0
October 28th, 2014 09:00
Hi Haitham,
I will let the Experts reply to your questions, but let me recommend the latest October blog posts on the Isilon Community to you about OneFS and Hadoop, and Splunk as an alternative to Hadoop.
Isilon
Happy reading.
Niki
helghareeb
1 Rookie
•
19 Posts
0
October 30th, 2014 11:00
Hello Ryan Peterson
Honestly, those are the answers I was looking for. You really have no idea how much you helped me with your answer. Thank you for the thorough answer, and the effort and time you took to provide this answer. Actually, I have more questions, hope you don't mind.
Sorry for the questions. I am in academia, and I am starting my research career in "File Systems". A topic I find very useful actually. Sorry for any inconvenience and thanks for your generosity.
eghamilton
28 Posts
0
October 30th, 2014 15:00
Hi Haitham. Thanks for your question. Let me see if I can address the object portion.
Regardless of whether a file is stored as an object, file or a block, it is technically stored as a contiguous block of data on a disk. File/NAS storage and Object storage are simply abstractions above that process. As far as handling large files, that is precisely what Object storage is designed for. Rather than using a file system with a hierarchical structure, Object storage stores a file with both the metadata and raw data packaged together as a unique object. This object is then stamped with a unique identifier and placed in a non-hierarchical bucket.
It seems as though you are referencing Content Addressed Storage (CAS). Centera is an example of CAS. With Centera, the application requests to create a new file and the app server sends the file to Centera. Centera performs the Content Address calculation using a proprietary hash and sends the address back to the application. The application database stores the content address for future reference. The content address is a unique, digital fingerprint that guarantees content authenticity and immutability. When an application needs to access the file, the application only needs to know the content address. The authorization data is not stored within the object. That is governed by the application and the user's privileges at the application layer.
Other object platforms work similarly but use different methods of creating the unique object ID which is stored in an index.
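As a rough illustration of the content-addressing idea described above, here is a minimal Python sketch. It is not Centera's implementation: SHA-256 is an assumption standing in for the proprietary hash, the store is a plain in-memory dictionary, and the class and method names are hypothetical.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store: an object's address is a hash of its content."""

    def __init__(self):
        # Flat (non-hierarchical) index: content address -> (metadata, data)
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        # SHA-256 is an assumption; a real CAS platform may use another hash.
        address = hashlib.sha256(data).hexdigest()
        # Metadata and raw data are kept together as one object.
        # Identical content maps to the same address, so it is stored once.
        self._objects.setdefault(address, (dict(metadata), data))
        return address

    def get(self, address: str) -> bytes:
        _metadata, data = self._objects[address]
        # Recomputing the hash checks that the content still matches its address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentAddressedStore()
# The application keeps only the returned address for later retrieval.
address = store.put(b"quarterly report", {"owner": "app-server"})
same_address = store.put(b"quarterly report", {"owner": "another-app"})
other_address = store.put(b"quarterly report v2", {"owner": "app-server"})
```

In this sketch, storing the same bytes twice yields the same address, while any change to the content yields a different one, which is the property that lets the address act as a fingerprint.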
As far as security, each operation is individually authenticated. So, if a user is not authenticated, they will not have permission to access a file. Again, this is done at the application and access control layer.
Access to object storage is via an API, most often a RESTful API such as Amazon S3, EMC Atmos or OpenStack Swift.
For a more detailed explanation, here are a few resources for you:
ViPR Services Storage Engine Architecture White Paper
EMC Atmos Cloud Storage Architecture
JamieD73
32 Posts
0
November 5th, 2014 13:00
I understand that one of the questions consistently asked at the Hadoop World booth was "Which Hadoop distribution should I choose?"
How do our Big Data experts respond to that?
helghareeb
1 Rookie
•
19 Posts
0
November 9th, 2014 20:00
Ryan Peterson George
Thank you so much for the shared answers. You are truly the experts. I appreciate your time.
It is really difficult to find great answers online without directly talking to the experts.
Can't wait for the next "Ask the Expert" to take questions to the next level.
Just after I finish checking the useful resources you have provided.
Thanks
Haitham
eghamilton
28 Posts
0
November 10th, 2014 06:00
Which Hadoop distribution should you use? In the case of ViPR HDFS, EMC gives you the option to choose the Hadoop distribution that best fits your needs. ViPR Services is an object-based unstructured storage engine. ViPR Services supports access to the underlying data via Object APIs such as S3, OpenStack Swift and EMC Atmos. It also provides an HDFS interface to an object bucket. ViPR presents an HDFS-compatible file system. ViPR HDFS provides a client library (ViPR-HDFS Client) that is installed on all the data nodes that run MR jobs on the customer’s Hadoop cluster. As such, the customer can use the distribution of their choice.
When a task running on the data node needs to read a file, the request will go to the ViPR-HDFS client (the customer will point to viprfs:// as their data source) and the ViPR client will communicate with the HDFS head on the ViPR data node. The ViPR client passes in an authN token that identifies the user to the HDFS head.
The HDFS head in the ViPR data node receives requests from the ViPR-HDFS client. The HDFS head then verifies the user’s identity by authenticating against the KDC. Once authN and authZ succeed, it talks to the ViPR Services engine and the controller process running on the node to fetch the requested data.
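The read path described above can be pictured with a small Python simulation. This is only a sketch of the sequence of checks, not ViPR code: every class here is hypothetical, the KDC is replaced by a set of known users, and the per-path access list used for authZ is an assumption, since the post does not say how authorization is decided.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical authN token that identifies the user."""
    user: str


class FakeKDC:
    """Stand-in for the KDC: accepts only users it knows about."""

    def __init__(self, known_users):
        self._known = set(known_users)

    def verify(self, token: Token) -> bool:
        return token.user in self._known


class HdfsHead:
    """Stand-in for the HDFS head running on the storage node."""

    def __init__(self, kdc, acl, engine):
        self._kdc = kdc
        self._acl = acl        # assumed authZ model: path -> users allowed to read
        self._engine = engine  # stand-in for the storage engine: path -> bytes

    def read(self, path: str, token: Token) -> bytes:
        if not self._kdc.verify(token):                      # authN
            raise PermissionError("authN failed")
        if token.user not in self._acl.get(path, set()):     # authZ
            raise PermissionError("authZ failed")
        return self._engine[path]                            # fetch the data


class ViprHdfsClient:
    """Stand-in for the client library installed on the Hadoop data nodes."""

    SCHEME = "viprfs://"

    def __init__(self, head: HdfsHead, token: Token):
        self._head = head
        self._token = token

    def open(self, uri: str) -> bytes:
        if not uri.startswith(self.SCHEME):
            raise ValueError("expected a viprfs:// URI")
        # Pass the path and the user's token on to the HDFS head.
        return self._head.read(uri[len(self.SCHEME):], self._token)


head = HdfsHead(
    FakeKDC({"alice"}),
    {"bucket/logs.txt": {"alice"}},
    {"bucket/logs.txt": b"log data"},
)
data = ViprHdfsClient(head, Token(user="alice")).open("viprfs://bucket/logs.txt")
```

A token for a user the stand-in KDC does not know would raise `PermissionError` before any data is fetched, mirroring the order of steps in the description.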
Bottom line, the goal of ViPR HDFS is to extend analytic capabilities to additional data sources, for example, a large, PB-scale archive for metadata querying, etc. But you can use your existing Hadoop distribution.
eghamilton
28 Posts
0
November 10th, 2014 06:00
Thank you Haitham. We appreciate your participation. The community is only as valuable as the participants!
RobertoAraujo1
2 Intern
•
718 Posts
0
November 10th, 2014 06:00
This ATE event has ended. We would like to thank everyone who participated in this discussion, with special thanks to our experts, who took time out of their busy schedules to answer our users' questions.
Cheers.