Performing Data Analysis with Multiple Tools: Pandas, R and Deedle (F#/C#)



Overview:

The seminar will begin with an overview of data science and many of the steps required for collecting, cleaning, organizing and deriving information from data. These steps are common to all of the tools and will form the foundation for analyzing, comparing and using the tools covered in the remainder of the seminar: R, pandas and Deedle.

 

Topics covered in the foundation are:

  • Organizing information into data frames
  • Mutating data frame objects
  • Indexing data frames
  • Data alignment between data frames
  • Handling missing data
  • Joining data
  • Reading and writing data from files and databases
  • Accessing data from web services
  • Reshaping of data (i.e., pivoting and melting)
  • Slicing and subsetting data
  • Grouping of data
  • Deriving aggregate results from data
  • Handling of time series information
  • Resampling time series data into other frequencies
  • Shifting time series data
  • Moving and sliding window statistics
  • Data visualization
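
As a rough illustration of three of the topics above (data alignment, handling missing data, and grouping with aggregation), here is a minimal sketch in pandas. The data values and variable names are made up for illustration and are not taken from the seminar materials.

```python
import pandas as pd

# Two small series with partially overlapping labels (illustrative data only).
a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"])
b = pd.Series([10.0, 20.0], index=["y", "z"])

# Data alignment: arithmetic matches values by index label.
# "x" has no partner in b, so the result for "x" is NaN (missing).
total = a + b

# Handling missing data: replace the missing value with a default.
filled = total.fillna(0.0)

# Grouping and aggregation on a small data frame.
df = pd.DataFrame({
    "team": ["red", "red", "blue"],
    "score": [3, 5, 4],
})
mean_by_team = df.groupby("team")["score"].mean()
```

R data frames and Deedle frames offer comparable operations, though the function names and defaults differ in each tool.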


The seminar will then proceed to demonstrations and hands-on experience using RStudio, pandas in Enthought Canopy, and Deedle in Visual Studio. Each topic covered in the data science overview will be demonstrated with each tool, including hands-on installation and experimentation through several exercises. By the end of these three sessions (R, pandas and Deedle), participants will be able to create their own data science environments in any of the tools, know how to apply the core concepts in each environment, and be able to determine which tools best fit their needs.

 

Why should you attend?

Data science is becoming the must-have skill in technology and business. But do you know what it is, why it is important, and how it differs from business intelligence and big data? Do you know that there are very useful open source tools on multiple platforms that allow you to perform data science? That you can leverage your existing Python or .NET programming skills, or use domain-specific languages such as R?

 

Data science is deep knowledge discovery through the interactive exploration of data. The discipline often involves using mathematical and algorithmic techniques to solve some of the most analytically complex business problems, leveraging troves of raw information to uncover hidden insights that lie beneath the surface. It centers on evidence-based analytical rigor and building robust decision capabilities.

 

Data science matters because it enables companies to operate and strategize more intelligently. It is all about adding substantial enterprise value by learning from data.

 

Most people who start learning data science turn to domain-specific languages such as R. If you know even a little Python, there is a well-established Python library, pandas (the Python Data Analysis Library). And on the .NET platform, there is the open source Deedle library for exploratory data analysis in F# and C#, which was created by quants at BlueMountain Capital.

 

Areas Covered in the Session:

  • General concepts in data sciences
  • Data frames
  • Time series
  • Data alignment
  • Data mutation
  • Indexing
  • Joins
  • Slicing
  • Local and web data access
  • Time series manipulation
  • Grouping and aggregation
  • Handling missing data
  • Statistical analysis
  • Moving and sliding windows
  • Visualization
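
For the time series items in this list (resampling, shifting, and moving windows), a minimal pandas sketch might look like the following. The dates and values are invented for illustration and the example is not drawn from the seminar exercises.

```python
import pandas as pd

# Six daily observations (made-up values for illustration).
idx = pd.date_range("2016-01-01", periods=6, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# Resampling: aggregate the daily values into 2-day bins by summing.
two_day = ts.resample("2D").sum()

# Shifting: move every value forward one period; the first slot becomes NaN.
lagged = ts.shift(1)

# Moving window: 3-period rolling mean; the first two results are NaN
# because a full window is not yet available.
rolling_mean = ts.rolling(window=3).mean()
```

The same ideas carry over to R and Deedle, where the corresponding functions take a window size or target frequency in a similar way.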

Who Will Benefit:

  • Data Analysts Wanting to Learn any of the Covered Tools
  • IT VP and Development Managers
  • Application / Software Architects
  • Application / Software Developers and Engineers
  • Individuals or organizations interested in data science and how to derive meaning from data

Agenda:

Day 1 Schedule:

 

Lecture 1:

  • A Survey of data science and tools
  • R, Pandas and Deedle

Lecture 2:

  • General concepts in data sciences
  • Data frames
  • Time series
  • Data alignment
  • Data mutation
  • Indexing
  • Joins
  • Slicing
  • Local and web data access
  • Time series manipulation
  • Grouping and aggregation
  • Handling missing data
  • Statistical analysis
  • Moving and sliding windows
  • Visualization

Lecture 3:

  • Pandas for data science
  • Data frames
  • Time series
  • Data alignment
  • Data mutation
  • Indexing
  • Joins
  • Slicing
  • Local and web data access
  • Time series manipulation
  • Grouping and aggregation
  • Handling missing data
  • Statistical analysis
  • Moving and sliding windows
  • Visualization
  • Exercises

Day 2 Schedule:

Lecture 4:

  • R for data science
  • Installing R and R Studio
  • Data frames
  • Time series
  • Data alignment
  • Data mutation
  • Indexing
  • Joins
  • Slicing
  • Local and web data access
  • Time series manipulation
  • Grouping and aggregation
  • Handling missing data
  • Statistical analysis
  • Moving and sliding windows
  • Visualization
  • Exercises

Lecture 5:

  • Data sciences with Deedle
  • Installation
  • Data frames
  • Time series
  • Data alignment
  • Data mutation
  • Indexing
  • Joins
  • Slicing
  • Local and web data access
  • Time series manipulation
  • Grouping and aggregation
  • Handling missing data
  • Statistical analysis
  • Moving and sliding windows
  • Visualization
  • Exercises

 

Speaker:

Michael Heydt

Principal Technology Manager, SunGard Global Services 

Michael Heydt is an independent consultant, educator and trainer with nearly 30 years of professional software development experience, during which time he has focused on agile software design and implementation using advanced technologies in multiple verticals including media, finance, energy and healthcare. He holds an MS in Mathematics and Computer Science from Drexel University, and a Master's in Technology Management from the University of Pennsylvania School of Engineering and Wharton Business School.

 

His studies have focused on technology management, software engineering, entrepreneurship, information retrieval, data science, and computational finance. Since 2005 he has specialized in building energy and financial trading systems for major investment banks on Wall Street and several global energy trading companies, utilizing .NET, C#, WPF, TPL, DataFlow, Python, R, Mono, iOS, Android, and many other tools too numerous to list.

 

His current interests are creating seamless applications that span desktop, mobile and wearable technologies and that utilize high concurrency, high availability, real-time data analytics, augmented and virtual reality, cloud services, messaging, computer vision, natural user interfaces, and software-defined networks. He is the author of numerous technology articles, papers and books, is a frequent speaker at .NET user groups and various mobile and cloud conferences, and regularly delivers webinars on advanced technologies.

 

 

Date: January 14th & 15th, 2016

Time: 9:00 AM to 6:00 PM

 

Location: SFO, CA

Venue: DoubleTree by Hilton Hotel San Francisco Airport, 835 Airport Blvd., Burlingame, CA 94010-9949

 

Price: $1,295.00 (Seminar for One Delegate)

 

Register now and save $200. (Early Bird)

Until December 20, Early Bird Price: $1,295.00. From December 21 to January 13, Regular Price: $1,495.00.

 

Quick Contact:

 

NetZealous BDA as GlobalCompliancePanel

Phone: 1-800-447-9407

Fax: 302-288-6884

Email: support@globalcompliancepanel.com   

Website: http://www.globalcompliancepanel.com

Registration Link - http://bit.ly/Performing-Data-Analysis-with-Multiple-Tools

Expected Number of Attendees

60

Relevant Government Agencies

Industry


This event has no exhibitor/sponsor opportunities


When
Thu-Fri, Jan 14-15, 2016, 9:00am - 6:00pm


Cost

Seminar for One Delegate:  $1495.00


Where
DoubleTree by Hilton Hotel San Francisco Airport
835 Airport Blvd.
Burlingame, CA 94010-9949



Organizer
NetZealous BDA as GlobalCompliancePanel

