Saturday, December 30, 2017

Extract URL Data Like Facebook Using PHP, jQuery and Ajax

This post explains how to extract URL data the way Facebook does, using PHP, jQuery and Ajax. URL extraction here means retrieving the meta tag details that a webpage exposes in its head tag.

From the meta tags we pick up important information such as the title, description, images and some URLs. You may have noticed that most large companies, such as Facebook, Twitter, Google and LinkedIn, use this link extraction technique on their pages. This technique is also known as web scraping.

What is web scraping?

Web scraping refers to an application that processes the HTML of a web page to extract data for manipulation, such as converting the web page to another format (e.g. HTML to WML).
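
To make the idea concrete, here is a minimal sketch of scraping a title, description and og:image out of a raw HTML string. The function name extractPreview and the sample markup are assumptions made for illustration, and the regular expressions assume double-quoted attributes with name/property placed before content; a real page may need a proper HTML parser.

```javascript
// Hypothetical helper: pull a few preview fields out of raw HTML.
// Regex matching is a simplification, not a full HTML parser.
function extractPreview(html) {
  // Collapse whitespace so a <title> split across lines still matches.
  const flat = html.replace(/\s+/g, " ");
  const pick = (re) => {
    const m = flat.match(re);
    return m ? m[1].trim() : null;
  };
  return {
    title: pick(/<title[^>]*>(.*?)<\/title>/i),
    description: pick(/<meta\s+name="description"\s+content="([^"]*)"/i),
    image: pick(/<meta\s+property="og:image"\s+content="([^"]*)"/i),
  };
}

// Made-up sample page used only for this sketch.
const sample = `<html><head>
  <title>Example
    Page</title>
  <meta name="description" content="A short summary.">
  <meta property="og:image" content="https://example.com/cover.png">
</head><body></body></html>`;

console.log(extractPreview(sample));
```

Fields that are missing from the page come back as null, so the caller can decide what to show in their place.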

This post provides a simple example of extracting URL data, and it also gives you an idea of how to get cross-domain data with jQuery and Ajax by routing the request through a PHP script on your own server.

Let's go through the source code below step by step.

index.html
This page contains a text box where you enter the URL for link extraction. The extraction process is divided into simple steps, which are as follows:
  1. When the user types or pastes a URL into the text box, a jQuery keyup event listener attached to the text box sends an Ajax request to the "fetch_url.php" page.
  2. "fetch_url.php" fetches the requested page and extracts its meta tag details, and the response comes back without any page refresh.
  3. The script then displays the returned details in HTML format.
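
Step 1 only makes sense when the typed text is a real web address. The page's script tests for "://" anywhere in the value, which would also accept inputs such as file:// paths. As a rough sketch of a stricter check, the standard URL constructor can be used; the helper name isHttpUrl is hypothetical and not part of the original code.

```javascript
// Hypothetical helper: true only for well-formed http/https URLs.
function isHttpUrl(value) {
  try {
    const u = new URL(value.trim()); // throws on anything that is not an absolute URL
    return u.protocol === "http:" || u.protocol === "https:";
  } catch (e) {
    return false;
  }
}

console.log(isHttpUrl("https://example.com/post")); // true
console.log(isHttpUrl("file:///etc/passwd"));       // false
console.log(isHttpUrl("example.com"));              // false
```

A check like this could stand in for the indexOf test before the Ajax request is sent, although the server should still validate the link on its own side.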
The following code implements Facebook-like URL data extraction with jQuery and Ajax.
<!DOCTYPE html>
<html>
<head>
<style>
.container, #url {
  width: 500px;
  border: 1px solid #d6d7da;
  padding: 0px 5px 5px 5px;
  border-radius: 5px;
  font-family: arial;
  color: #333333;
  font-size: 14px;
  background: #ffffff;
  box-shadow: rgba(200,200,200,0.7) 0 4px 10px -1px;
  margin: 0px auto;
  float: left;
  clear: both;
  margin-top: 10px;
}
</style>
<script type="text/javascript" src="jquery-3.2.1.min.js"></script>
<script type="text/javascript">
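$(document).ready(function() {
    // Runs on every key release in the URL field.
    $("#url").keyup(function() {
        var val = document.getElementById("url").value;
        // Only send a request once the text looks like an absolute URL.
        if (val != "" && val.indexOf("://") > -1) {
            $('#loading').text('Loading...');
            $('.container').hide();
            $.ajax({
                type: 'post',
                url: 'fetch_url.php',
                data: {
                    link: val
                },
                cache: false,
                success: function(response) {
                    $('#loading').text('');
                    $('.container').show();
                    $('.container').html(response);
                },
                error: function() {
                    // Replace the "Loading..." text if the request fails.
                    $('#loading').text('Could not load the URL.');
                }
            });
        }
    });
});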
</script>  
</head>
<body>  
<h1>Skptricks Extract URL Data Like Facebook Using PHP,jQuery And Ajax</h1>
<div>
  <textarea id="url" placeholder="Enter Complete URL" ></textarea>
  <div id="loading" style="clear:both;" ></div>
  <div class="container" style="display:none;"></div>
</div>

</body>
</html>
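
Because the listener is attached to keyup, the page above sends one request per keystroke once the text contains "://". A common way to reduce that is to debounce the handler. The sketch below is one possible approach, not part of the original page; the helper name debounce and the 50 ms delay are assumptions chosen for illustration.

```javascript
// Hypothetical helper: delay calls to fn until `wait` ms pass without a new call.
function debounce(fn, wait) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer); // cancel the previously scheduled call, if any
    timer = setTimeout(() => fn.apply(this, args), wait);
  };
}

// Usage sketch: three rapid calls, only the last one actually runs.
const calls = [];
const send = debounce((url) => calls.push(url), 50);
send("http://a");
send("http://ab");
send("http://abc");
setTimeout(() => console.log(calls), 100); // [ 'http://abc' ]
```

In the page, the same helper could wrap the keyup callback, for example `$("#url").keyup(debounce(handler, 400))`, where handler stands for the existing function body.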

fetch_url.php
This source code extracts information from the webpage at the requested link, such as the page title, the page description and an image.
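<?php
// Escape text for safe output in HTML without double-encoding existing entities.
function esc($s)
{
   return htmlspecialchars($s, ENT_QUOTES, "UTF-8", false);
}

// Only handle requests that carry a "link" parameter.
if (!isset($_POST["link"])) {
   exit;
}
$main_url = trim($_POST["link"]);

// Accept only http/https links so the script cannot be pointed at local files.
$scheme = strtolower((string) parse_url($main_url, PHP_URL_SCHEME));
if (!in_array($scheme, array("http", "https"), true)) {
   exit("Please enter a valid http or https URL.");
}

// Download the page; @ hides the warning if the request fails.
$str = @file_get_contents($main_url);
if ($str === false || strlen($str) == 0) {
   exit("Could not fetch the URL.");
}

// This code block is used to extract the title
$title = "";
$flat = trim(preg_replace('/\s+/', ' ', $str)); // supports line breaks inside <title>
if (preg_match('/<title[^>]*>(.*?)<\/title>/i', $flat, $m)) {
   $title = trim($m[1]);
}

// This code block is used to extract the description
$tags = @get_meta_tags($main_url);
$description = (is_array($tags) && isset($tags["description"])) ? $tags["description"] : "";

// This code block is used to extract og:image, which Facebook extracts from the webpage;
// it is also considered the default image of the webpage
$img = array();
$d = new DOMDocument();
@$d->loadHTML($str);
$xp = new DOMXPath($d);
foreach ($xp->query("//meta[@property='og:image']") as $el) {
   $content = $el->getAttribute("content");
   // Keep only values that carry a scheme, i.e. absolute URLs.
   if (parse_url($content, PHP_URL_SCHEME)) {
      $img[] = $content;
   }
}
?>
   <a href="<?php echo esc($main_url); ?>" style="text-decoration: none;" target="_blank">
   <?php
      if (!empty($img)) {
         echo "<img style='max-height:100%; max-width:100%;' src='" . esc($img[0]) . "'><br>";
      }
      echo "<br><h2 id='title'>" . esc($title) . "</h2>";
      echo "<p id='desc'>" . esc($description) . "</p>";
   ?>
   </a>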
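
The PHP code above keeps an og:image value only when it carries a scheme, so pages that declare a relative image path end up with no preview image. One possible refinement is to resolve such values against the page URL. The sketch below shows the idea in JavaScript, to match the page's client-side script; the helper name resolveImage and the example.com addresses are assumptions, and the same logic would have to be ported if it is wanted on the server side.

```javascript
// Hypothetical helper: turn an og:image value into an absolute http(s) URL,
// or return null when it cannot be resolved.
function resolveImage(content, pageUrl) {
  try {
    const u = new URL(content, pageUrl); // relative values resolve against pageUrl
    return (u.protocol === "http:" || u.protocol === "https:") ? u.href : null;
  } catch (e) {
    return null;
  }
}

console.log(resolveImage("/img/cover.png", "https://example.com/blog/post.html"));
// https://example.com/img/cover.png
console.log(resolveImage("//cdn.example.com/a.jpg", "https://example.com/"));
// https://cdn.example.com/a.jpg
```

Values that already are absolute http or https URLs pass through unchanged, so the helper can be applied to every og:image found.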

That is all for the Extract URL Data Like Facebook Using PHP, jQuery and Ajax tutorial. If you have any queries or suggestions, please comment below.

Download Link :
https://github.com/skptricks/php-Tutorials/tree/master/Extract%20URL%20Data%20Like%20Facebook%20Using%20PHP-%20jQuery%20and%20Ajax


2 comments:

  1. It gives error when trying to extract https from localhost like this:

    file_get_contents(): SSL operation failed with code 1. OpenSSL Error messages: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure in

  2. Check your php.ini file. If you are using shared hosting, create one first.

    allow_url_fopen=1
