Artificial Intelligence has been on the rise in the technology sector for years, and until recently it was a specialized tool used only by the most advanced companies and researchers. Now AI has worked its way into everyday applications aimed at making our lives, our work, and our capacity to analyze information far more efficient. One platform AI has taken to like a duck to water is the website experience. To better understand how AI can enhance your website experience, we take a look at its two-fold impact: the customer experience is enhanced, and the company’s ability to use analytics to the fullest is pushed beyond previously established boundaries.

The Customer Experience

Much of whether a company makes money through its website is determined by the customer experience. One of the largest catalysts AI has provided to enhance this factor is voice- and visual-search queries. A study by Gartner predicts that by 2021, early-adopter brands that redesign their websites to support visual and voice search will increase their digital commerce revenue by 30 percent. A major retailer that has already taken advantage of AI’s ability to enhance the customer experience is Amazon. It has been reported that customers actually prefer searching for items directly on Amazon.com rather than doing a Google search. While using the Amazon app on a smart device, you can easily dictate your search into the search field. Take a critical look at the website and you’ll see that many of the options presented are represented by an image or graphic. Not only is this new way of searching convenient for the average customer, it also gives customers with disabilities the opportunity to be more independent!

But search features aren’t the only way speech interaction is improving the customer experience; chatbots also make a customized experience more attainable. These AI channels are becoming so advanced that interacting with one increasingly resembles interacting with a real human consultant. It is even theorized that chatbots will eventually be able to read our emotional reactions to products and situations! Imagine walking or driving up to a coffee shop and having a chat pop up on your phone telling you what your last order was, asking whether you would like to order it again, and offering to charge the same payment method you used previously. You could order, purchase, and pick up your drink without ever standing in line.

The Business Experience

The business experience has a symbiotic relationship with the customer experience. While the customer interacts with the AI and enjoys a convenient way to get what they want, the business uses all the information the AI collects from the customer to improve the customer experience further. This allows a company to increase profit by becoming smarter about its business, products, and services. Examples include tracking the most-searched terms on the website, the item or service selling the most units, or, conversely, the one selling the least. AI can also note which parts of the website are interacted with most, which can help simplify the user interface in the future. The metrics AI can generate are endless, as are the possibilities for using this information to drive positive change.

Integrating AI into your webpage can take your online experience to a new level. Here at Komaya, we love keeping abreast of emerging technology and ways that it might help enhance the web presence of our clients’ brands.

Original article published at The Market Mogul, May 8, 2018.

“Artificial Intelligence (AI) is the solution. What is the problem?” This exchange is representative of the pervasiveness of AI. Largely unnoticed, an AI revolution is upon us. Something has changed in the last half decade to make this happen.

Student Becomes Master

In Indian folklore, the cat was the lion’s mentor; she had hardly finished teaching her student how to hunt when the lion pounced on her. She escaped by climbing a nearby tree, a trick she had not taught the lion. Like that creation-turns-on-creator story, events of AI matching or exceeding human performance are making big news. Last year AI beat top humans at poker, a strategic game in which both the cards and bluffing matter. A similar feat was achieved in 2016, when a computer dethroned Lee Sedol, the world champion of Go, a complex board game of Chinese origin. Beyond these shocking successes over humans in strategy and board games, there has been significant improvement in the technical performance metrics of machines. This is leading to breakthroughs in autonomous vehicles, visual recognition, natural language processing, lip reading, and many other fields, signalling the development of algorithms with skills superior to humans.

In the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC), machines showcased dramatic progress, improving from an image-recognition error rate of 28% in 2010 to less than 3% in 2016; the best human performance is about 5%. The deployment of a new computational technique, the deep convolutional neural network, in 2012 brought this marked improvement in image processing and set off an industry-wide boom. Similarly, both Microsoft and IBM increased the accuracy of speech recognition on the Switchboard corpus from 85% in 2011 to 95% in 2017, equalling human performance. Advances in machine learning, pattern recognition, and a host of other technologies will enable autonomous cars to drive on roads without human drivers this year. Alphabet’s Waymo leads the pack of self-driving car makers in that race.

Classical Versus Deep Learning AIs

The biological brain consists of two different but complementary cognitive systems: the rational (model-based, programmed) and the intuitive (model-free, learning through experience). Understanding how the mind selects between these modes, and the interface between intuition and rational thought, is beyond current technology. In the classical approach, AI and its subsets, machine learning and neural networks, followed rule-based (if-then) systems that imitated the decision-making of experts while ignoring the intuitive part. For complex problems, it was difficult to encode the rules and make the requisite knowledge base available. Classical AI languished for almost sixty years because human cognition is not based on logic alone.

A new approach, based on continuous mathematics, was developed. In deep learning, the new name given to artificial neural networks, the classical logic-based method has been passed over in favour of experimental computation using continuous mathematics. This change has been made possible by the growing infrastructure of ubiquitous connectivity, exponentially increasing computing power, powerful algorithms, and big data sets.

Artificial neural networks, which mimic the biological brain, are used for mathematical modelling. The brain is imitated using electronically simulated neurons, interconnected and stacked in layers. The hidden layers perform mathematical computations on the inputs. Iterating through the data sets, the network’s outputs are improved via backpropagation and a technique called gradient descent, which adjusts the parameters to reduce the model’s error. Perhaps by stumbling on nature’s design, the process works very well, possibly because it imitates intuition as the human brain does.
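As a loose illustration of those two ideas, here is a minimal sketch, assuming nothing beyond plain NumPy, of a tiny one-hidden-layer network trained on the classic XOR problem: the forward pass runs the inputs through the hidden layer, backpropagation pushes the error back through the layers, and gradient descent nudges each parameter to improve the model. Every name and number below is made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: XOR, a problem a single layer cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of four simulated, interconnected neurons.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0  # learning rate: the step size of gradient descent

for step in range(5000):
    # Forward pass: the hidden layer computes on the inputs.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backpropagation: the chain rule pushes the output error
    # back through each layer to get a gradient per parameter.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient descent: nudge every parameter downhill.
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2))  # should approach [0, 1, 1, 0]
```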

Astonishing Results and Possible Futures

The results are so astounding that even the creators of AI programmes cannot explain them. AI systems are like black boxes, taking in questions on one side (“Should this autonomous vehicle accelerate or apply the brakes at this yellow light?” “What is the next move in this board or strategy game?” “What are the objects in this image?”) and giving out answers on the other. It is difficult to explain how the black box works, but it does work. In some situations it will be difficult to rely on such an unpredictable, inscrutable, and unexplainable system. This method, using neural networks inspired by the human brain, also requires tons of quality data to be useful, compared with the very little data humans need. Another flaw in deep learning is its inflexibility in applying experience learned in one case to solving another; humans can learn abstract concepts and apply them across different applications.

Algorithmic, or specialised, intelligence, known as narrow artificial intelligence (NAI), has existed for years and has now got some teeth. It is benefiting humankind in many ways but is also capable of causing large-scale damage to an increasingly digital and interconnected infrastructure. General artificial intelligence (GAI) is general-purpose, human-level intelligence and is perhaps a decade away. While GAI could meet challenges such as climate change, disease, and other problems humans are unable to solve, it will also cause economic and cultural upheaval. GAI, undergoing recursive self-improvement, could give rise to superintelligence, the third flavour of AI. Superintelligent AIs would radically outperform humans in every field and may pose an existential threat to humanity. They could appear in the next few decades, or not at all.

Human intelligence must first be understood before it can be created. With 20,000 AI research papers published annually, and both enrolment in AI programmes and investment climbing, humanity may be close to accidentally creating a general artificial intelligence.

Here at Komaya, we love keeping abreast of emerging technology and ways that it might help enhance the web presence of our clients’ brands.

We live in an incredible age, where scientific advancement progresses by leaps and bounds heretofore unseen in the history of humanity. As new technologies are developed daily, old ones fall by the wayside. All of this change amounts to a very real and influential force in the way individual companies, and even entire industries, do business. The key is to know what the march of technological innovation means for your business. Here are some helpful ways to think about interacting with emerging technology as an entrepreneur.

Know What’s Out There

If you are a busy business owner, it can be easy to be so focused on what you’re doing (which is, after all, working just fine) that you don’t notice ways new technology could improve your workflow or business model. Look up from your own business every once in a while and pay attention to what’s going on in the industry around you. If your industry has a trade magazine (or a few), subscribe. Follow leading bloggers who are active in your field. Coming across an idea in one of these forums can plant a seed that you can begin to consider as you think about how to improve your business.

Don’t Get Tunnel Vision

While focusing on your industry is a good start, it can also be useful to pay attention to emerging technology in other businesses that relate only loosely (if at all) to yours. How many times have you heard something called “the Uber of” this or “the Netflix of” that? Some of the biggest entrepreneurial breakthroughs have occurred when one bold soul looked at an idea in a different industry and said, “Hey, I bet that would work in my space too!”

Use New Tech Efficiently, Not Obsessively

It can be easy to become consumed with every new technology that has even a tangential connection to your industry. Energy spent trying to keep up with the latest trends is energy that could be spent growing your business in other ways. That’s not to say that seeking out new technological advances that could help your business is bad. You just need to be selective and intentional with the attention you pay to the subject, and precise and calculated in how you implement new ideas.

The Early Bird Gets the Worm, but the Second Mouse Gets the Cheese

Being the first of your competitors to embrace a new technological breakthrough can give you a competitive advantage, but that advantage doesn’t come without risk. Think back to the Blu-ray vs. HD DVD battle: HD DVD actually hit the market first. If you had thrown all of your eggs into that basket, you’d have been pretty disappointed when Blu-ray won out in the long run. Every technological advancement has a chance of failing, and at the blinding pace of innovation and corresponding obsolescence, a chance of being surpassed just as quickly as it appeared.

By keeping your finger on the pulse of technology that relates to your industry AND knowing how to use it effectively, you can separate yourself from the competition.

Thomas Casselberry

Original article published by Information Management, April 6, 2015.

Ever wonder how to migrate from a relational database management system (RDBMS) schema structure to a big data platform (a NoSQL database implementation)? This article (part 2 of a series; see “Big Data: part 1”) offers some practical advice for such a migration.

In the following sections you’ll find information that shows the ins and outs of migrating from a traditional RDBMS schema structure to a big data storage structure. While most of the examples are fairly generic, the final structure will be created for the “Wide Column / Column Family Store” variety. There are many ways of reaching the end goal; the process I illustrate is one of many, but it works well for what I’m trying to show.

Warning! Here’s where things get a little deep, technically speaking. Not to worry, though! Just like in American football when you know you’ll be breaking through the defensive line, buckle up your chin strap and keep going. Even if this discussion causes you to twitch a bit (indicating you may sit toward the non-technical end of the population), you’ll probably come through it just fine.

Relational to Composite Structure

For the sake of narrowing the scope of this discussion, I’m going to limit the focus to the “De-normalize and duplicate” approach. We’ll ignore the “Balanced” approach because it takes too much time to explain correctly, and the “Model it as-is” approach because it’s a poor practice and doesn’t perform well.

With our discussion scope sufficiently narrowed, we’ll start by tackling a relatively simple relational structure. The very first thing we need to do is evaluate which entities can be de-normalized to become what I call super-classes. “Super-class” is not a standard big data term; it’s my term, and I find it makes the initial discussion easier. I’ll explain why later. Each of these super-classes will be used to help define the new composite structure (an actual big data term). We’ll use the following Entity Relationship Diagram (ERD) to lay out the steps needed to identify our super-classes.

[Figure: Entity Relationship Diagram (ERD) for the RSS feed schema]

In this example we have a relational structure used to manage information about Really Simple Syndication (RSS) feeds. Remember, there are no joins and no referential integrity in the NoSQL database world. Thus, we’ll be adapting the relational structure represented in the ERD into one that better matches a composite structure.

Finding and de-normalizing the entities in our diagram into our super-classes can be as easy as asking a few simple questions. These questions will help us identify the super-class candidates, and the de-normalization will help us create them correctly. Ready? Let’s go!

Does the entity contain foreign keys (references) to other entities?

If the answer to the foreign key question is no, we ask an additional clarifying question: do other entities reference this entity? If yes, in most cases we have a pure reference entity that will be absorbed into one or more super-classes. Reference entities become part of the super-classes that reference them (de-normalization).

If the answer to the foreign key question is yes, the entity could be either an intersection/junction or an actual super-class. To discover which, we ask another clarifying question: does the entity contain only foreign keys (whether they all make up the primary key, or sit alongside a separate primary key, dates, or other superfluous information)? A “no” means we’ve identified a super-class. A “yes” means we’ve identified an intersection/junction entity; we’ll ignore entities of this type (for now).
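To make the questions concrete, here is a rough sketch of the same decision procedure in Python. The per-entity flags are assumptions read off the ERD, purely illustrative, and the output anticipates the analysis that follows.

```python
def classify_entity(has_foreign_keys, is_referenced_by_others,
                    contains_only_keys_or_superfluous_columns):
    """Apply the de-normalization questions from the text to one entity."""
    if not has_foreign_keys:
        # No outward references; if others point at it, it's a pure
        # reference entity that gets absorbed into super-classes.
        return "reference" if is_referenced_by_others else "standalone"
    if contains_only_keys_or_superfluous_columns:
        return "intersection/junction"
    return "super-class"

# Hypothetical flags per entity: (has FKs, is referenced, keys-only).
entities = {
    "categories":    (False, True,  False),
    "sources":       (False, True,  False),
    "item_types":    (False, True,  False),
    "feed_contents": (True,  False, True),
    "feeds":         (True,  True,  False),
    "contents":      (True,  True,  False),
}
for name, flags in entities.items():
    print(f"{name} -> {classify_entity(*flags)}")
```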

Let’s take a look at how this type of analysis works with our existing schema structure. Using the questions we just went through, it looks as though we have three reference entities: “categories,” “sources,” and “item_types.” We also have an intersection/junction entity: “feed_contents.” That leaves the two remaining entities, “feeds” and “contents,” as our identified super-classes. That’s it, right? Not quite.

We haven’t touched on the ever-so-popular (pause for effect…) modeling jargon, the lingo, you know, the terms you just have to know! Let’s get that taken care of now, shall we? The following table shows the common terms we’ll need to understand to make that all-important transition in our thinking when moving from relational to non-relational modeling.

Relational term      Wide Column Store term
Database             Keyspace
Table                Column family
Primary key          Row key
Column               Name-value pair

See! That wasn’t so bad. The key here is that, unlike in the relational world, we have to begin to think about structures in more granular terms. A relational column breaks down into a name-value pair. Primary keys become row keys. Tables are column families, and databases are keyspaces. Column families can be related to one another, but not in the traditional sense, so the concepts of joins and referential integrity do not translate well at all.

Before we get into the resulting structure, we still have to address a few missing parts; namely, we need to follow through with the de-normalization of the reference entities. Let’s start with the reference table “item_types.” Its information will need to be duplicated into both the “feeds” and the “contents” super-classes. That doesn’t mean simply adding the columns from one table to another; instead, think of it as nesting. I’ll use braces and brackets to illustrate table boundaries, nesting depth, and column ownership as follows:


[Figure: braces-and-brackets notation showing “item_types” nested inside “feeds”]

You see, the “item_types” entity literally becomes part of the “feeds” entity. Using that same technique, let’s fast-forward and show how our “feeds” super-class looks after de-normalization.

[Figure: the “feeds” super-class after de-normalization]
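As a rough code-form sketch of the same nesting, here is the de-normalized “feeds” super-class expressed with plain Python dicts; the column names and values are hypothetical stand-ins for the ones in the ERD.

```python
# The "feeds" super-class as nested dicts; reference entities are
# duplicated wholesale into the structure that references them.
feed = {
    "feed_id": "f-001",                # will become the row key
    "title": "Example Tech News",
    "url": "http://example.com/rss",
    "item_type": {                     # "item_types" absorbed whole
        "item_type_id": "rss2",
        "description": "RSS 2.0",
    },
    "source": {                        # "sources" absorbed the same way
        "source_id": "s-42",
        "name": "Example Publisher",
    },
    "category": {                      # "categories" absorbed as well
        "category_id": "c-7",
        "name": "Technology",
    },
}

print(feed["item_type"]["description"])  # no join required -> RSS 2.0
```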

Along with the modeling terms and the de-normalization technique we just learned, we also need to know about composite structures and how they can look in Wide Column Stores. Each has its uses, pros, and cons. While I’m going to introduce these structures and their uses, I won’t get into all of the pros and cons; there are currently too many arguments for and against one structure over another to list them accurately. That said, here we go.

[Figure: the column family structure]

The column family structure is what I would call a “class.” It’s a pretty generic structure and can be easily used to model simple entities.

[Figure: the super column family structure]

The super column family structure is derived from what I’ve been calling a “super-class.” It’s the one structure that allows us to model our de-normalized entities, and it also accommodates elements of the column family structure.

Let’s look at an illustration of how our “feeds” super-class converts into the “feeds” super column family.

[Figure: the “feeds” super-class converted into the “feeds” super column family]
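Here’s a hedged sketch, again with plain dicts, of how that super column family might be laid out: a row key maps to super columns, each of which groups ordinary name-value pairs. The keys and values are illustrative, not taken from any particular product’s API.

```python
# A super column family: row keys map to super columns, which in
# turn group plain name-value pairs.
feeds_scf = {
    "f-001": {  # row key
        "details":   {"title": "Example Tech News",
                      "url": "http://example.com/rss"},
        "item_type": {"item_type_id": "rss2",
                      "description": "RSS 2.0"},
        "source":    {"source_id": "s-42",
                      "name": "Example Publisher"},
    },
}

# Reading one super column for one row is a single, join-free lookup:
print(feeds_scf["f-001"]["item_type"]["description"])  # -> RSS 2.0
```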

I know what you’re thinking: “What about that intersection/junction entity we’ve been ignoring?” You’re right; now is definitely the right time to bring that up. How are we going to handle the many-to-many relationship? My preferred way is to model it explicitly.

[Figure: the lookup column families that model the many-to-many relationship explicitly]

While this may seem like a waste of space, this model lets us efficiently query all the contents for a specific feed, as well as all the feeds that carry a specific content item, without touching the actual super column families, making the whole lookup process more efficient. This form of modeling is also called query pattern modeling.
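To make the idea concrete, a minimal sketch of the two lookup column families might look like this; the row keys and the duplicated columns are invented for the example.

```python
# Two lookup column families answer the two query patterns directly,
# duplicating just enough data (a title here) to render a listing
# without touching the super column families.
contents_by_feed = {
    "f-001": {                         # feed row key
        "c-100": {"title": "Post A"},
        "c-101": {"title": "Post B"},
    },
}
feeds_by_content = {
    "c-100": {                         # content row key
        "f-001": {"title": "Example Tech News"},
    },
}

# "All contents for feed f-001" is one row read:
print(list(contents_by_feed["f-001"]))  # -> ['c-100', 'c-101']
# "All feeds carrying content c-100" is likewise one row read:
print(list(feeds_by_content["c-100"]))  # -> ['f-001']
```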

How many lookup column families you create, and which data columns you place in them, depends on what data your application intends to use and how you’d like to model your query patterns. Ultimately, modeling the data structure based on the type of data you’re after is the right way to go. I will note that some structures require much less de-normalization than ours; in our example we opted for super column families to hold the bulk of our data and did the de-normalization for that up-front.

A note on performance: modeling the intersection/junction explicitly caused us to create two lookup column families. It also caused some obvious additional de-normalization for both “feeds” and “contents.” Even with the extra duplication, this method is still a very reasonable way to populate lookup information for both feed and content detail pages. Both will perform well with this model, and both will cause two lookups: one to query feeds and one to query contents. That cost remains constant regardless of the total number of feeds and contents (a good thing).

In this article, I’ve illustrated how to define a data model for a “Wide Column / Column Family Store” NoSQL database implementation. I’ve shown how to define the model and discussed an applicable modeling technique for our narrowly defined scope.

The information presented is relevant regardless of the structure modeling technique you end up using for the “Wide Column / Column Family Store” NoSQL database implementation you choose.


Thomas Casselberry

Original article published by Information Management, January 21, 2015.


So, you’ve succumbed to the buzz, and now you’re looking around trying to make heads or tails of the mass of information out there hyped as “big data.” Or perhaps you’re even ready to start your own internal project to get your existing applications on the bandwagon. In either case, terrific! Your decision is a good one.

Unfortunately, now comes the flurry of potentially overwhelming questions:

  • Where do I start?
  • What are my expectations?
  • What does big data mean to my company?
  • What does big data mean in the context of our applications?
  • How do I assess my application needs?
  • How do I know or determine if big data solutions will work for us?

After some online research, you’ll quickly find that most folks are merely picking a place at the edge of the pool, dipping their toes in here and there to test the water temperature.

The reality is that big data is incredibly difficult to define; its meaning includes much more than just storing and using large data sets. When you hear people referring to big data, what they’re actually referring to is the use of NoSQL database implementations to store and process large amounts of information.

Don’t be discouraged! In the following sections we’ll get into what those NoSQL databases are and how to identify which is best, if any, for your project[s]. That’s right; the goal here is to provide you with information that will allow you to draw the correct conclusions for your organization and your specific application[s] or project[s]. (In part two of this article, I explain how to migrate from relational databases to NoSQL.)

NoSQL databases aren’t really databases. In fact, they are nothing like a traditional relational database management system (RDBMS). Instead, they are implementations of various data stores that have no fixed schemas, referential integrity, defined joins, or common storage model. They also typically do not adhere to ACID principles (atomicity, consistency, isolation, and durability), and the technologies behind them vary widely. The term NoSQL (or Not-only SQL) is intended to convey that many of these implementations also support SQL-like query capabilities.
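A tiny sketch of what “no fixed schema” means in practice, using plain Python dicts as stand-ins for any real NoSQL product: two records in the same store can carry entirely different fields, something a fixed relational table would reject.

```python
# Two records in the same "store" carrying different fields: fine in
# a schemaless world, impossible in a fixed relational table.
orders = []
orders.append({"id": 1, "customer": "Acme", "total": 99.50})
orders.append({"id": 2, "customer": "Zenith",
               "coupon": "SPRING10",       # a field record 1 lacks
               "items": ["book", "pen"]})  # nested data, no join table

for order in orders:
    print(sorted(order))  # each record defines its own fields
```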

In this big data market where the NoSQL database is king, there are more than 100 different offerings available under various licensing models. The fact that these non-databases vary is no accident: each distinct implementation has different strengths, weaknesses, and generally accepted uses. However, the bulk of these break down into four major categories based on common underlying characteristics: key-value stores, document stores, column family / wide column stores, and graph databases.

Choosing the Right Path

A heavy emphasis should be placed on defining your requirements. What are those? Well, that’s a large discussion all by itself, but I’ll paraphrase quickly for the purpose of furthering this topic: data requirements are artifacts captured while defining application behavior with respect to gathering, storing, retrieving, or displaying information (data).

For example, does your application process stock quotes, work with CRM data, or handle social information? Different application types have different needs, and thus there are a varied number of NoSQL implementations, not all of which are designed to be applicable to yours.

It may be that after careful evaluation, you determine that your current RDBMS approach is valid and appropriate for your application. That’s not a bad thing. Traditional RDBMS certainly has its place and will remain very relevant for business use well into the future.

You see, there’s loads of confusion about this big data thing because there is no One Path concept. It just doesn’t exist. What’s good for one business may not be good for another, even when they’re doing similar things. Many factors go into selecting the right implementation and, honestly, not everyone is careful or critical when evaluating their needs.

An Assessment

So you still think you might be better off migrating away from an RDBMS to a NoSQL implementation. This decision isn’t for the faint of heart; it requires real consideration and planning, which raises several additional questions:

Question 1: How do I know which implementation is correct for my needs?

Fortunately, there are certain high-level criteria that help get the decision process started. One such criterion is determined by answering this question: is the application read- or write-intensive today (e.g., large numbers of transactions in an OLTP system or large numbers of reads in an OLAP system)? If the answer is no, and we’re merely dealing with volumes of information, we can automatically exclude items that fall into the “Column Families/Wide Column Stores” category. There are always caveats, but that’s a good general rule of thumb. An initial litmus test, if you will.

Let’s test this with an example. We’ve got a compiled application that processes transactions for a global book seller. It sees no fewer than 70,000 transactions a minute (nearly 1,200 per second), 24 hours per day, 7 days per week. Which category of NoSQL database implementations fits? It’s obvious, right? You bet: the “Column Families/Wide Column Stores” category.

Let’s try another. This time we’ve got a web-based application that lets title companies offer secure signing of large numbers of documents. The transactional volume isn’t large, nor is the number of reads. Which category fits now? Right: the “Document Store” category, primarily because only a small amount of data changes while whole documents and their signatures need to be stored.
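As a back-of-the-envelope sketch (not an official taxonomy), the litmus test and the two examples above could be encoded like this; the threshold and inputs are invented for illustration.

```python
# Thresholds and inputs are made up; real assessments need far more.
def suggest_category(tx_per_minute, document_centric):
    if tx_per_minute >= 10_000:   # read/write-intensive workload
        return "Column Families / Wide Column Stores"
    if document_centric:          # whole documents, few changes
        return "Document Store"
    return "needs further analysis"

print(suggest_category(70_000, False))  # global book seller
print(suggest_category(50, True))       # document-signing application
```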

I think you’ve probably got a handle on how to determine initial fits by category now. So long as we ask the right questions, with the right view of our data requirements, we should always be able to identify the right category and begin assessing which implementations might meet our big data needs.

We’re done then, right? We can pick any random implementation from the category we’ve identified, get it installed and configured, and deploy our application[s] so we can start touting our big data story! Hold on there; we’re not quite done yet.

Question 2: What will we be using to communicate?

The language[s] your application[s] use to communicate is an important consideration when choosing the right NoSQL database implementation. In the earlier examples we identified the right categories based on our knowledge of the requirements and the application usage. Now we need to narrow the field of choices and zero in on what is likely to be right for our needs. Unfortunately, we actually don’t have enough information to identify, even at a high level, which implementation matches our communication needs. Yet!

We’ll need help answering this and other questions, so we’ll employ another table. In it, we’ve laid out a few of the most popular NoSQL database implementations along with their protocol[s], API[s], licenses, and replication models:

[Table: popular NoSQL database implementations with their protocols, APIs, licenses, and replication models]

Clear as mud? Right! Basically, this table shows exactly what you’ll find on the web: every NoSQL database implementation has its own way of getting data into and out of its store. Fortunately, this is precisely what we need to match our requirements with implementations in our selected category because, in the end, our application[s] will need to know how to interact with the store.

Shall we try another example? This time, we’ll use our high-transaction application for the global book seller, which we’ve already matched to NoSQL database implementations in the “Column Families / Wide Column Stores” category. Now we look at the application’s communication requirements. The application today communicates via an ODBC driver (a compiled binary). We could assume that if we can use one compiled driver, we can use any of them. That line of thinking is not unheard of, and it would lead us to select Cassandra from the table above because it resides in the right category and because it also uses a compiled driver: its Thrift driver. A stretch? Perhaps a small one, but it’s close enough for the purposes of our discussion.

Can it really be that easy? Well, yes and no. There are many other factors that we should consider when choosing a NoSQL database implementation. Some of these will leverage information from the table above. The additional criteria can be determined by answering the following questions:

  • Is this a commercial application? If so, is it for internal or external use? There may be specific licensing restrictions that must be considered, especially in commercial, for-profit applications.
  • Are there existing deployment restrictions, such as supported operating systems, single- vs. multi-server deployment, the replication model, or the need to meet specific backup requirements in the company’s disaster recovery plan?

Failing to ask these questions, or to heed the answers, may cause your project to fail due to a poorly selected match.

I hope this has helped in some small way to eliminate some of the confusion around big data and what it might mean for your company.