Roman Gelembjuk
Extract useful content from an HTML page using Go

I have started learning the Go (golang) programming language. As my "Hello, world!" project, I decided to write a tool that extracts the useful content from an HTML page.

How it works

I use the goquery package to parse an HTML page into a well-structured DOM tree.

The next step is to drop all nodes (tags) that carry no useful content, such as script, style, and head.

Then, starting from the top-level node, I apply a function recursively to the nodes of the tree. The function first checks whether the node has subnodes.

If it does, the function checks how the text is distributed among the node's children.

If the text is distributed relatively evenly, the current node is the one we are looking for, and its text is returned as the result.

If the text is concentrated mainly in one subnode, the function is called recursively on that subnode.

To detect whether the text is distributed evenly, I take the total length of all text (ignoring HTML) in a node and then calculate the mean deviation of the text lengths of its subnodes. With these two values, I check whether the mean deviation is small relative to the total text length. If it is, the text is distributed relatively evenly.
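The check described above can be sketched roughly as follows. This is only an illustration and not the code from the articletext package: the Node type, the helper names (filler, textLen, meanDeviation, isEven, findContent) and the threshold value 0.2 are all assumptions made for this example, and a real implementation would work on goquery selections instead of a toy tree.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Node is a hypothetical, simplified stand-in for a DOM node.
type Node struct {
	Name     string
	Text     string // text placed directly in this node
	Children []*Node
}

// filler returns n bytes of dummy text for building sample trees.
func filler(n int) string { return strings.Repeat("x", n) }

// textLen returns the length in bytes of all text in n and its descendants.
func textLen(n *Node) int {
	total := len(n.Text)
	for _, c := range n.Children {
		total += textLen(c)
	}
	return total
}

// meanDeviation returns the mean absolute deviation of the values.
func meanDeviation(values []int) float64 {
	if len(values) == 0 {
		return 0
	}
	sum := 0
	for _, v := range values {
		sum += v
	}
	mean := float64(sum) / float64(len(values))
	dev := 0.0
	for _, v := range values {
		dev += math.Abs(float64(v) - mean)
	}
	return dev / float64(len(values))
}

// isEven reports whether the mean deviation of the child text lengths is
// small relative to their total. The threshold is an assumed tuning value.
func isEven(lengths []int, threshold float64) bool {
	total := 0
	for _, l := range lengths {
		total += l
	}
	if total == 0 {
		return true
	}
	return meanDeviation(lengths)/float64(total) <= threshold
}

// findContent returns n if its text is spread evenly over its children
// (or if it has none); otherwise it descends into the child with most text.
func findContent(n *Node, threshold float64) *Node {
	if len(n.Children) == 0 {
		return n
	}
	lengths := make([]int, len(n.Children))
	largest := 0
	for i, c := range n.Children {
		lengths[i] = textLen(c)
		if lengths[i] > lengths[largest] {
			largest = i
		}
	}
	if isEven(lengths, threshold) {
		return n
	}
	return findContent(n.Children[largest], threshold)
}

func main() {
	page := &Node{Name: "body", Children: []*Node{
		{Name: "nav", Text: filler(20)},
		{Name: "article", Children: []*Node{
			{Name: "p", Text: filler(100)},
			{Name: "p", Text: filler(120)},
			{Name: "p", Text: filler(110)},
		}},
	}}
	fmt.Println(findContent(page, 0.2).Name) // prints "article"
}
```

In this sample tree, the text under body is concentrated in one child, so the function descends into it. The three paragraphs inside article have similar lengths, so article is returned. The threshold probably needs tuning on real pages, since the ratio also depends on the number of children.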

Here is the code: https://github.com/Gelembjuk/articletext

Usage:

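package main

import (
    "fmt"
    "os"

    "github.com/gelembjuk/articletext"
)

func main() {
    // The page URL is expected as the first command-line argument.
    if len(os.Args) < 2 {
        fmt.Fprintf(os.Stderr, "usage: %s <url>\n", os.Args[0])
        os.Exit(1)
    }
    url := os.Args[1]

    // Download the page and extract its main text.
    text, err := articletext.GetArticleTextFromUrl(url)
    if err != nil {
        fmt.Fprintln(os.Stderr, "error:", err)
        os.Exit(1)
    }

    fmt.Println(text)
}

Last Updated on Wednesday, 06 July 2016 17:11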
 

Imagine you have a database created with MySQL (or another SQL server). You need to share this database with a community and allow write access. Because of the trust problem, there cannot be any “master” nodes or administrative roles with special permissions. Additionally, you cannot fully trust every user who wants to use the database, yet you still expect the DB to keep working.

In other words, you need to create a distributed ledger. The best-known technology for turning a local database into a distributed ledger is a blockchain.

How to do this?

How much coding will it take to connect your MySQL DB to blockchain tools?

What if your local DB already has a GUI and you want to reuse it in your distributed ledger too?

Some time ago I had this question. I didn’t find a solution at the time.

So I got the idea to create such a tool.

The idea

I would like to have a tool that does all the work of replicating data between copies of the same database managed by different people. I want to work on my DApp's business logic as easily as on any centralised application, such as a web site or a desktop app that stores data in a local database. In my DApp I don’t want to think about how data changes are delivered to other instances of the app.

Last Updated on Friday, 07 December 2018 09:22
 
The quickest way to access different file clouds on Linux

I have tested a lot of tools and services for storing files online. I chose SMEStorage.com and use it now.

I found that the main problem with online storage services is the speed of accessing a file.

If you need to get a file from your cloud, you must:

  • start a browser
  • open the site (e.g. smestorage.com)
  • log in to the site
  • open the page with your files
  • sometimes browse through folders, which means a new page/data load for each folder

What do you do if you need to get one file very quickly over a bad Internet connection?

Last Updated on Thursday, 15 April 2010 13:35
 
SMEStorage Joomla component important fix

There was a Local File Inclusion vulnerability in the SMEStorage Joomla component.

Please download the updated version 1.1 and reinstall if you use version 1.0.

Many thanks to LatinHackTeam for finding this and letting us know.

 
Business news sentiment analysis. First results

I have completed the first version of the News Sentiment Analysis tool.

Some live results can be found on this page: Business News Sentiment Analysis

 