Showing posts with the label Data

KDD Impact Program to support projects that have impact on society

The KDD Impact Program is looking to fund projects that have potential for a significant impact on society, expand outreach of data science, and strengthen the community, with grants of $10k-$100k for each project. Submission deadline is Dec 1, 2017. Data http://ift.tt/2zdpmjO October 17, 2017 at 01:28AM

Jimdo: Data Analyst

Jimdo is seeking a Data Analyst to join its cross-functional Data Team at the Jimdo headquarters in the beautiful city of Hamburg. You will bring domain knowledge and experience to a diverse team, contribute to the understanding and use of data throughout the company, and enable data-driven decision making. Data http://ift.tt/2ysTASH October 17, 2017 at 01:22AM

Top Stories, Oct 9-15: Want to Become a Data Scientist? Read This Interview First; An Overview of 3 Popular Courses on Deep Learning

Also: A Quick Guide to Fake News Detection on Social Media; How I started with learning AI in the last 2 months; Tidyverse, an opinionated Data Science Toolbox in R from Hadley Wickham; Understanding Machine Learning Algorithms; 30 Essential Data Science, Machine Learning & Deep Learning Cheat Sheets Data http://ift.tt/2ysRu5h October 17, 2017 at 01:03AM

Baseball, apple pie, and Stan

Ben sends along these two baseball job ads that mention experience with Stan as a preferred qualification: St. Louis Cardinals Baseball Development Analyst; Tampa Bay Rays Baseball Research and Development Analyst. The post Baseball, apple pie, and Stan appeared first on Statistical Modeling, Causal Inference, and Social Science. Data http://ift.tt/2gmwDKI October 16, 2017 at 10:00PM

Announcing RStudio Professional Drivers

(This article was first published on RStudio Blog, and kindly contributed to R-bloggers) Today we are excited to announce the availability of RStudio Professional Drivers. There are, of course, many ways to connect to databases using R. RStudio Professional Drivers are specifically intended for use with our professional products, including RStudio Server Pro, Shiny Server Pro, and RStudio Connect. These data connectors combined with enhancements to dplyr, the odbc package, and the RStudio IDE provide a comprehensive suite of tools for accessing and analyzing data with your enterprise systems. Connect to popular data sources: RStudio Professional Drivers help you connect to some of the most popular databases. Available for download today are ODBC drivers for Microsoft SQL Server, Oracle, PostgreSQL, Apache Hive, Apache Impala, and Salesforce. We will add several more drivers over the coming months. Don’t see your database listed? Please contact our sales team to l

Predicting State Healthcare Quality – at Predictive Analytics World Healthcare – Oct 29 – Nov 2

In anticipation of his upcoming conference presentation at Predictive Analytics World for Healthcare in New York, Oct 29–Nov 2, we asked Feras Batarseh, Research Assistant Professor, George Mason University a few questions about incorporating predictive analytics into healthcare. Data http://ift.tt/2yohaOx October 16, 2017 at 09:29PM

colourpicker package v1.0: You can now select semi-transparent colours in R (& more!)

(This article was first published on Dean Attali's R Blog, and kindly contributed to R-bloggers) For those who aren’t familiar with the colourpicker package, it provides a colour picker for R that can be used in Shiny, as well as other related tools. Today it’s leaving behind its 0.x days and moving on to version 1.0! colourpicker has gone through a few major milestones since its inception. It began as merely a colour selector input in an unrelated package (shinyjs), simply because I didn’t think a colour picker input warranted its own package. After gaining a gadget and an RStudio addin (as well as some popularity!), it graduated to become its own package. Earlier this year, the Plot Helper tool was added. And now colourpicker is taking its next big step – an upgrade to version 1.0. Table of contents: Due credit; New feature #1: Transparent colours; New feature #2: Flexible colour specification; New feature #3: Type colour directly into input box; Existing feat

Can we use B-splines to generate non-linear data?

(This article was first published on ouR data generation, and kindly contributed to R-bloggers) I’m exploring the idea of adding a function or set of functions to the simstudy package that would make it possible to easily generate non-linear data. One way to do this would be using B-splines. Typically, one uses splines to fit a curve to data, but I thought it might be useful to switch things around a bit to use the underlying splines to generate data. This would facilitate exploring models where we know the assumption of linearity is violated. It would also make it easy to explore spline methods, because as with any other simulated data set, we would know the underlying data generating process. B-splines: A B-spline is a linear combination of a set of basis functions that are determined by the number and location of specified knots or cut-points, as well as the (polynomial) degree of curvature. A degree of one implies a set of straight lines, degree of two implies a quad
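The idea in the excerpt above, treating a B-spline as a weighted sum of basis functions and using it to generate rather than fit data, can be sketched in a few lines. This is only an illustrative sketch in Python with NumPy, not the simstudy implementation: the function name `bspline_basis`, the knot locations, the weights `theta`, and the noise level are all assumptions chosen for the example.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Evaluate B-spline basis functions at x using the Cox-de Boor recursion.

    `knots` is the full (padded) knot vector; the result has one column per
    basis function, i.e. len(knots) - degree - 1 columns.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)

    # Degree 0: indicator of each half-open knot interval [t[i], t[i+1]).
    B = np.zeros((len(x), len(t) - 1))
    for i in range(len(t) - 1):
        B[:, i] = (x >= t[i]) & (x < t[i + 1])
    # Make the last non-empty interval include the right boundary.
    last = np.max(np.nonzero(t[:-1] < t[1:])[0])
    B[x == t[-1], last] = 1.0

    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        B_next = np.zeros((len(x), len(t) - 1 - d))
        for i in range(len(t) - 1 - d):
            left = t[i + d] - t[i]
            right = t[i + d + 1] - t[i + 1]
            if left > 0:
                B_next[:, i] += (x - t[i]) / left * B[:, i]
            if right > 0:
                B_next[:, i] += (t[i + d + 1] - x) / right * B[:, i + 1]
        B = B_next
    return B

# Hypothetical setup: quadratic splines on [0, 1] with three interior knots.
degree = 2
interior = [0.25, 0.5, 0.75]
knots = [0.0] * (degree + 1) + interior + [1.0] * (degree + 1)

x = np.linspace(0.0, 1.0, 200)
B = bspline_basis(x, knots, degree)                # shape (200, 6)

# Hypothetical weights: the "true" curve is the weighted sum of the bases.
theta = np.array([0.2, 0.6, 0.1, 0.9, 0.2, 0.8])
rng = np.random.default_rng(1)
y = B @ theta + rng.normal(scale=0.05, size=len(x))  # non-linear data with noise
```

Because the weights are chosen by hand, the data generating process is known exactly, which is the property the post is after when it talks about exploring models where linearity is violated.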

Social Media and Machine Learning Transform Self-service Data Prep

Social media and machine learning concepts are transforming self-service data prep into a collaborative data marketplace. Data http://ift.tt/2ifIaMc October 16, 2017 at 07:30PM

My interview with ROpenSci

The ROpenSci team has started publishing a new series of interviews with the goal of “demystifying the creative and development processes of R community members”. I had the great pleasure of being interviewed by Kelly O'Briant earlier this year, and the interview was published on Friday. Thanks for being a great interviewer, Kelly! I'm looking forward to hearing from other R community members as the rest of the series is published. ROpenSci blog: .rprofile: David Smith Data http://ift.tt/2ynTWIl October 16, 2017 at 07:21PM

Key Trends and Takeaways from RE•WORK Deep Learning Summit Montreal – Part 1: Computer Vision

Read up on what you missed from the RE•WORK Deep Learning Summit Montreal, held October 10 & 11, including talks from Aaron Courville, Ira Kemelmacher-Shlizerman, Roland Memisevic, and Raquel Urtasun. Data http://ift.tt/2gKd77L October 16, 2017 at 06:29PM

How LinkedIn Makes Personalized Recommendations via Photon-ML Machine Learning tool

In this article we focus on the personalization aspect of model building and explain the modeling principle as well as how to implement Photon-ML so that it can scale to hundreds of millions of users. Data http://ift.tt/2ym2T7y October 16, 2017 at 05:31PM

Big Data Gets Bigger with the iPhone and Apple Watch in Healthcare Industry

Apple and IBM have reached a new agreement to use big data analytics software to turn digital health into much more than a step counter. The collaboration between these two companies and others in the healthcare environment could affect every part of health provision. The Demand for Powerful Large Data Analysis: The sheer volume of health data demands powerful big data analytics systems capable of crunching it, while extracting useful insights demands the type of deep learning intelligence that IBM's Watson provides. The new deal with Apple is a key component of IBM's mission to launch a worldwide health analysis cloud, Watson Health, since it enables more accurate collection of data via ResearchKit. All partners will inevitably look for ways to prove that such data gathering can provide useful insights for public health protection. They need that proof to provide such services in the public domain, beyond research. Regulators of the health sector would not

Freelance orphans: “33 comparisons, 4 are statistically significant: much more than the 1.65 that would be expected by chance alone, so what’s the problem??”

From someone who would prefer to remain anonymous: As you may know, the relatively recent “orphan drug” laws allow (basically) companies that can prove an off-patent drug treats an otherwise untreatable illness, to obtain intellectual property protection for otherwise generic or dead drugs. This has led to a new business of trying large numbers of combinations of otherwise-unused drugs against a large number of untreatable illnesses, with a large number of success criteria. Charcot-Marie-Tooth is a moderately rare genetic degenerative peripheral nerve disease with no known treatment. CMT causes the Schwann cells, which surround the peripheral nerves, to weaken and eventually die, leading to demyelination of the nerves, a loss of nerve conduction velocity, and an eventual loss of nerve efficacy. PXT3003 is a drug currently in Phase 2 clinical testing to treat CMT. PXT3003 consists of a mixture of low doses of baclofen (an off-patent muscle relaxant), naltrexone (an off-patent medica
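The 1.65 in the headline is simply 33 comparisons multiplied by a 0.05 significance level. As a rough check on how surprising 4 significant results would be, the sketch below computes the binomial tail probability. It assumes all 33 comparisons are independent, every null hypothesis is true, and each test uses a 0.05 threshold; none of these assumptions is confirmed by the excerpt, and the calculation is not taken from the original post.

```python
from math import comb

n_tests = 33      # comparisons reported in the headline
alpha = 0.05      # assumed per-test significance level
observed = 4      # comparisons reported as statistically significant

# Expected number of false positives if every null hypothesis is true.
expected_false_positives = n_tests * alpha

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Chance of seeing 4 or more "significant" results from noise alone.
p_four_or_more = prob_at_least(observed, n_tests, alpha)

print(f"expected by chance: {expected_false_positives:.2f}")
print(f"P(at least {observed} significant): {p_four_or_more:.3f}")
```

Under these assumptions the tail probability comes out at roughly 8%, so 4 hits out of 33 is not strong evidence on its own. Correlated outcomes or flexible success criteria would move the real figure away from this idealized one.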

Sales Analytics: How to Use Machine Learning to Predict and Optimize Product Backorders

A Newbie’s Install of Keras & Tensorflow on Windows 10 with R

(This article was first published on R – Quality and Innovation, and kindly contributed to R-bloggers) This weekend, I decided it was time: I was going to update my Python environment and get Keras and Tensorflow installed so I could start doing tutorials (particularly for deep learning) using R. Although I used to be a systems administrator (about 20 years ago), I don’t do much installing or configuring so I guess that’s why I’ve put this task off for so long. And it wasn’t unwarranted: it took me the whole weekend to get the install working. Here are the steps I used to get things running on Windows 10, leveraging clues in about 15 different online resources. And yes (I found out the hard way), the order of operations is very important. I do not claim to have nailed the order of operations here, but definitely one that works. Step 0: I had already installed the tensorflow and keras packages within R, and had been wondering why they wouldn’t work. “Of course!” I f

Why Use Docker with R? A DevOps Perspective

(This article was first published on OpenCPU, and kindly contributed to R-bloggers) There have been several blog posts going around about why one would use Docker with R. In this post I’ll try to add a DevOps point of view and explain how containerizing R is used in the context of the OpenCPU system for building and deploying R servers. “Has anyone in the #rstats world written really well about the *why* of their use of Docker, as opposed to the *how*?” (Jenny Bryan, @JennyBryan, September 29, 2017) 1: Easy Development The flagship of the OpenCPU system is the OpenCPU server: a mature and powerful Linux stack for embedding R in systems and applications. Because OpenCPU is completely open source we can build and ship on DockerHub. A ready-to-go Linux server with both OpenCPU and RStudio can be started using the following (use port 8004 or 80): docker run -t -p 8004:8004 opencpu/rstudio Now simply open http://localhost:8004/ocpu/ and http://localhost:8

Citibike Business Opportunity: Advertising

(This article was first published on R – NYC Data Science Academy Blog, and kindly contributed to R-bloggers) Introduction: It is hard to wander around New York City without seeing rows of dozens of bright blue Citibikes planted in the middle of the busiest nooks and crannies of the city. These bikes belong to Citibike, a ride-sharing program that allows users to conveniently rent a bike to travel to their destinations without having to worry about the hassles of parking and locking their bicycle. Citibike has quickly become the preferred mode of transportation for many New Yorkers who are tired of the laundry list of issues with public transportation and are looking to get some fresh air as they travel around the city. The premise for the program is quite simple: you can choose between an annual pass for year-round access or a 3- or 7-day pass as a more temporary option. Pass holders are able to pick up a bike from a station near them and ride to their destination