Extract useful content from an HTML page using Go

I have started learning the Go (golang) programming language. As my "Hello, world!" project, I decided to write a tool that extracts the useful content from an HTML page.

How does it work?

I use the goquery package to convert an HTML page into a well-structured DOM tree.

The next step is to drop all nodes (tags) that carry no useful content, such as script, style, head etc.

Then, starting from the top-level node, I apply a function recursively to the nodes of the tree. The function detects whether a node has subnodes.

If it does, the function checks how the text is distributed between the child nodes of that node.

If the text is distributed relatively evenly, then the current node is the one we are looking for, and its text is returned as the result.

If the text is concentrated mainly in one of the subnodes, then the function is called recursively for that subnode.

To detect whether the text is distributed evenly, I take the total length of all text in a node (ignoring HTML) and then calculate the mean deviation of the text lengths of its subnodes. With these two values I check whether the mean deviation is small relative to the total text length. If it is, the text is distributed relatively evenly.
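The check described above can be sketched roughly as follows. This is only an illustration, not the code of the tool itself: the Node type is a simplified stand-in for a DOM node (the real tool works on goquery selections), and the names textLen, meanDeviation, isEven, findContent, sampleTree and the evenThreshold value of 0.25 are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Node is a simplified, hypothetical stand-in for a DOM node.
type Node struct {
	Text     string // text directly inside this node
	Children []*Node
}

// textLen returns the length of all text in the node and its descendants.
func textLen(n *Node) int {
	total := len(n.Text)
	for _, c := range n.Children {
		total += textLen(c)
	}
	return total
}

// meanDeviation returns the mean absolute deviation of the values.
func meanDeviation(values []float64) float64 {
	if len(values) == 0 {
		return 0
	}
	var sum float64
	for _, v := range values {
		sum += v
	}
	mean := sum / float64(len(values))
	var dev float64
	for _, v := range values {
		dev += math.Abs(v - mean)
	}
	return dev / float64(len(values))
}

// evenThreshold is an assumed cut-off; it would need tuning on real pages.
const evenThreshold = 0.25

// isEven reports whether the mean deviation of the children's text lengths
// is small relative to the total text length of the node.
func isEven(n *Node) bool {
	total := textLen(n)
	if total == 0 || len(n.Children) == 0 {
		return true
	}
	lengths := make([]float64, len(n.Children))
	for i, c := range n.Children {
		lengths[i] = float64(textLen(c))
	}
	return meanDeviation(lengths)/float64(total) < evenThreshold
}

// findContent descends into the child holding the most text until it
// reaches a node whose text is spread evenly (or a node without children).
func findContent(n *Node) *Node {
	if len(n.Children) == 0 || isEven(n) {
		return n
	}
	largest := n.Children[0]
	for _, c := range n.Children[1:] {
		if textLen(c) > textLen(largest) {
			largest = c
		}
	}
	return findContent(largest)
}

// sampleTree builds a small page: short menu, article with three
// paragraphs of equal length, short footer.
func sampleTree() *Node {
	p := func() *Node { return &Node{Text: strings.Repeat("x", 100)} }
	article := &Node{Children: []*Node{p(), p(), p()}}
	return &Node{Children: []*Node{{Text: "menu links"}, article, {Text: "copyright."}}}
}

func main() {
	found := findContent(sampleTree())
	fmt.Println("length of extracted text:", textLen(found))
}
```

In the sample tree the text of the top-level node sits almost entirely in its middle child, so the function descends into it; inside that child the three paragraphs have equal length, the mean deviation is zero, and the node is returned.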

Here is the code!


package main

import (

func main() {

    url := os.Args[1]
    text, err := articletext.GetArticleTextFromUrl(url)

Last Updated on Wednesday, 06 July 2016 17:11
Implementing website sign-up with popular social network logins in PHP

The majority of websites offer sign-up through social networks as an alternative to traditional registration/login. Buttons such as “Login with: Facebook, Twitter, Google,...” now very often appear next to the login form.

Using social networks for login simplifies the user experience and saves the user time.

I have used social login on several websites myself. Based on that experience, I decided to build a PHP class/package that summarises the social login process on an abstract level and makes it possible to implement the process with minimal coding.

Social login in general

For a user, social login looks like the same process on every social network. Usually it goes like this: the user clicks a “login with ...” button, is redirected to the social network's website, clicks “Allow” on that site and, finally, is redirected back to the original website, where the social login is processed and a session is created for the user.

However, on the back end there are some differences.

Some social network APIs don't require connecting to an API endpoint before redirecting the user to the login page. Others require a connection to the API first in order to initialise the login process, obtain a special redirect URL, check some settings etc.

Another difference is that some APIs require specific data (a request token etc.) to be saved between the two steps of the auth process: getting the redirect URL and completing the login. Others don't need anything to be saved.

All such differences must be taken into account during a social login process.
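One way to picture how these differences can be hidden behind a single two-step process is the sketch below. It is written in Go only to stay consistent with the other example on this page; the package described here is PHP, and none of these names (Provider, Begin, Complete, simpleProvider, tokenProvider) or the example.com URLs come from it. They are hypothetical, and the "API call" is simulated with a fixed token.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// Provider is a hypothetical abstraction of one social network login.
type Provider interface {
	// Begin returns the URL to redirect the user to, plus any data that
	// must be kept until Complete is called (nil if nothing is needed).
	Begin(callbackURL string) (redirectURL string, state map[string]string, err error)
	// Complete finishes the login from the callback parameters and the
	// data saved by Begin, and returns a user identifier.
	Complete(params, state map[string]string) (userID string, err error)
}

// simpleProvider builds its redirect URL locally and saves nothing.
type simpleProvider struct{}

func (simpleProvider) Begin(callbackURL string) (string, map[string]string, error) {
	return "https://login.example.com/auth?redirect=" + url.QueryEscape(callbackURL), nil, nil
}

func (simpleProvider) Complete(params, state map[string]string) (string, error) {
	if params["code"] == "" {
		return "", errors.New("missing code")
	}
	return "user-for-" + params["code"], nil
}

// tokenProvider must obtain a request token first and keep it between steps.
type tokenProvider struct{}

func (tokenProvider) Begin(callbackURL string) (string, map[string]string, error) {
	// A real provider would call its API here; the token is simulated.
	token := "req-123"
	state := map[string]string{"request_token": token}
	return "https://login.example.org/authorize?token=" + token, state, nil
}

func (tokenProvider) Complete(params, state map[string]string) (string, error) {
	if state["request_token"] == "" || params["token"] != state["request_token"] {
		return "", errors.New("request token mismatch")
	}
	return "user-for-" + params["token"], nil
}

func main() {
	providers := []Provider{simpleProvider{}, tokenProvider{}}
	// Callback parameters each network would send back (simulated).
	callbacks := []map[string]string{{"code": "abc"}, {"token": "req-123"}}

	for i, p := range providers {
		// Step 1: get the redirect URL; state would go into the session.
		redirect, state, err := p.Begin("https://mysite.example/callback")
		if err != nil {
			fmt.Println("begin failed:", err)
			continue
		}
		fmt.Println("redirect user to:", redirect)

		// Step 2: the user comes back; complete the login.
		user, err := p.Complete(callbacks[i], state)
		if err != nil {
			fmt.Println("login failed:", err)
			continue
		}
		fmt.Println("logged in as:", user)
	}
}
```

The calling code is the same for both providers: it always stores whatever state the first step returns and passes it to the second step, so it never needs to know which networks actually use it.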

Introducing the gelembjuk/auth PHP package, which unifies the social login process.

The package is available through Composer. It supports login with Facebook, Twitter, Google and LinkedIn using a unified process and minimal coding. Additionally, the package can easily be extended with new social networks, and the same process will be used for them.


Installation is the usual Composer procedure:

composer require gelembjuk/auth


Last Updated on Thursday, 30 July 2015 04:38
Sharing my experience in PHP programming. Error catching, Social login, Templaters, Email formatting/sending

After 13 years of PHP programming I decided to share some of my experience. I have uploaded a couple of my classes/packages to my GitHub profile; they may be useful for other developers.

For now I have shared 4 packages. All of them are also available through Composer and are very easy to install, like any package in the Composer listings.


gelembjuk/logger (in Composer, on GitHub)

The package contains classes for logging, error catching and display.

The ErrorScreen class helps to catch different types of errors (PHP exceptions, warnings and fatal errors), log the error information and display an appropriate error screen to the user depending on the expected format (html, json, xml, http).

The FileLogger class, based on Psr/Log, helps to filter what gets logged. You can switch different log groups on and off. The main idea of this class is to help debug big applications, but it also works as a normal "everyday" logger to a file. It is used in my other packages/classes but can be replaced with any other Psr/Log logger.

The ApplicationLogger trait makes it very easy to add logging functionality to different classes. When this trait is used, you don't need to write methods like log(), debug(), setLogger(), getLogger() etc. in your classes. It is really useful in big applications with many classes.


Last Updated on Wednesday, 29 July 2015 18:06
Joomla 1.7 site hacked. What to do?

Recently my Joomla 1.7 site was hacked. I knew nothing about it; the site worked fine. But when I ran some test searches on Google, I found that the site showed up for keywords that are not used anywhere on it. When I looked at the Google cache, I was surprised to see different pages there. The content of my pages had been replaced with advertising links and texts.

This was some sort of cloaking, and it meant that my Joomla site was infected with something bad.

I had already seen a similar problem on another of my sites, running Joomla 1.5, a year ago. So this time I knew what to do.

Last Updated on Monday, 31 October 2011 07:42
The quickest way to access different file clouds on Linux

I have tested a lot of tools and services for storing files online. I have chosen one and use it now.

I found that the biggest problem with online storage services is the speed of accessing files.

If you need to get a file from your cloud you have to:

  • start browser
  • open site (ex.
  • login to site
  • open page with files
  • sometimes you have to browse folders, and each folder means a new page/data load

What do you do if you need to get one file very quickly over a bad Internet connection?

Last Updated on Thursday, 15 April 2010 13:35