
@lablr-blog
AlchemyApi vs Goose: plain text extraction from news article html contents
I am looking for a better HTML scraper that focuses on the actual content of a news article, without the noisy text around it: just the title, the excerpt and the content. Boilerpipe does this really well. Is there a Pythonic equivalent? I tried the new release of AlchemyAPI and a really good Python library, Goose. I ran this test on an Italian article about Italian domestic politics published on repubblica.it.
A first test with AlchemyAPI produces this result:
TITLE: Renzi: "C'è chi scommette sulla sconfitta dell'Italia, non mandate i buffoni in Europa" - Repubblica.it
LANGUAGE: Italian
CONTENTS: Intervistato su RaiUno, il premier rilancia sulla necessità di abbassare le tasse alle imprese e attacca Grillo in vista del voto di domenica. Caos Expo: "Cantone non è supereroe ma neanche passatimbri, chi prende tangenti deve stare fuori dai palazzi della politica" Il premier Matteo Renzi (lapresse)ROMA - Votate chi vi pare - ha detto - ma non votate i buffoni. Poi riaffronta il nodo Expo e fa un passaggio su crisi e prelievo fiscale: "Il sistema politico può smettere di rompere le scatole" alle imprese "e abbassare le tasse".
This does not respect the main content of the article: for example, the photo caption "Il premier Matteo Renzi (lapresse)" is outside the main text and may break a standard text analysis. The title also still contains the newspaper name. However, the language recognition feature is really nice.
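The newspaper name left in the title can be trimmed after extraction. What follows is only a minimal sketch in Python 3, assuming the site name is known in advance and is appended after a " - " separator, as in the result above. The helper `strip_site_suffix` is hypothetical and is not part of AlchemyAPI.

```python
def strip_site_suffix(title, site_name, separator=" - "):
    """Remove a trailing '<separator><site_name>' from a page title."""
    suffix = separator + site_name
    if title.endswith(suffix):
        return title[: -len(suffix)].rstrip()
    # Leave the title untouched when the suffix is not present.
    return title


raw_title = (
    "Renzi: \"C'è chi scommette sulla sconfitta dell'Italia, "
    "non mandate i buffoni in Europa\" - Repubblica.it"
)
print(strip_site_suffix(raw_title, "Repubblica.it"))
```

Matching the exact suffix, rather than cutting at the last " - ", avoids truncating headlines that contain a dash of their own.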
After installing Goose from its GitHub repo, here is the result:
>>> from goose import Goose
>>> url = 'http://www.repubblica.it/politica/2014/05/18/news/renzi_c_chi_scommette_sulla_sconfitta_dell_italia-86480071/?ref=HRER1-1'
>>> g = Goose()
>>> article = g.extract(url=url)
>>> article.title
u'Renzi: "C\'\xe8 chi scommette sulla sconfitta dell\'Italia, non mandate i buffoni in Europa"'
>>> article.meta_description
u'Intervistato su RaiUno, il premier rilancia sulla necessit\xe0 di abbassare le tasse alle imprese e attacca Grillo in vista del voto di domenica. Caos'
>>> article.cleaned_text
u'- Votate chi vi pare - ha detto - ma non votate i buffoni. Poi riaffronta il nodo Expo e fa un passaggio su crisi e prelievo fiscale: "Il sistema politico pu\xf2 smettere di rompere le scatole" alle imprese "e abbassare le tasse". [ ... ]

Goose can extract the main content of the article even from a newspaper webpage, and it also provides the meta tag content. The title is the actual article title.
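To make "meta tag content" concrete, here is a rough standard-library sketch in Python 3 that reads the title and the description meta tag from a page. It is not how Goose works internally and it does no main-content extraction. The class `HeadMetadataParser`, the function `head_metadata` and the sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collect the <title> text and the description <meta> tag of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.meta_description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears between <title> and </title> is kept.
        if self._in_title:
            self.title += data


def head_metadata(html):
    """Return (title, meta description) for an HTML string."""
    parser = HeadMetadataParser()
    parser.feed(html)
    return parser.title.strip(), parser.meta_description.strip()


# Invented sample page, not the repubblica.it article.
sample = (
    "<html><head><title>Example headline</title>"
    '<meta name="description" content="Example summary."></head>'
    "<body><p>Body text.</p></body></html>"
)
print(head_metadata(sample))
```

A title read this way is whatever the page declares, so it would still carry the newspaper name. Removing it, and finding the article body, is the part a library such as Goose takes care of.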
django: how to import settings in an app
import os
import sys

# Make the project importable
path = '/your/project/path'
if path not in sys.path:
    sys.path.append(path)

# Tell Django which settings module to load
os.environ['DJANGO_SETTINGS_MODULE'] = 'yoursite.settings'

from django.conf import settings
print(settings.STATIC_URL)
Turtle
phpinfo via command line
Run phpinfo from the command line and save the result to a text file:
$ php -i -c /etc/php5/apache2/php.ini > info.txt
The info.txt file is saved in the directory from which the php command was run. More info:
php manual command line page
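The same capture can be scripted. This is only a sketch in Python 3, assuming the php binary is on the PATH. `save_command_output` is a hypothetical helper name, and the ini path is the one used in the command above.

```python
import subprocess


def save_command_output(command, output_path):
    """Run `command` (a list of arguments) and write its stdout to `output_path`."""
    with open(output_path, "w") as output_file:
        # check=True raises CalledProcessError if the command exits with an error.
        subprocess.run(command, stdout=output_file, check=True)
```

Calling `save_command_output(["php", "-i", "-c", "/etc/php5/apache2/php.ini"], "info.txt")` would then write the report to info.txt in the current working directory.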
modify .htaccess to use Zend without virtualhost
Replace the last line of the .htaccess file in the /public folder:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]
# RewriteRule ^.*$ index.php [NC,L] # this line replaced
RewriteRule ^.*$ /your-path/index.php [NC,L]
Finally, you must add an alias to your Apache config file that points to the /public folder.
Alias /your-path /home/user/public/your-path/public
<Directory /home/user/public/your-path/public>
    Options Indexes MultiViews FollowSymLinks
    AllowOverride All
    Order allow,deny
    Allow from all
</Directory>
Http headers, no layout, no view - Zend Framework
In your controller action (or directly in the init function), add:
class ApiController extends Zend_Controller_Action
{
    public function init()
    {
        /* Initialize action controller here */
        $this->_helper->layout->disableLayout();
        $this->_helper->viewRenderer->setNoRender(true);

        /* reinitialize headers */
        header('Content-type: text/plain; charset=UTF-8');
        header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
        header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
        header("Cache-Control: no-store, no-cache, must-revalidate");
        header("Cache-Control: post-check=0, pre-check=0", false);
        header("Pragma: no-cache");

        // your content ...
    }
}
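For comparison, the same set of no-cache headers can be built outside Zend. This is a rough Python 3 sketch, not Zend Framework code. The function name `no_cache_headers` is made up, and the values are copied from the controller above.

```python
from email.utils import formatdate


def no_cache_headers():
    """Return (name, value) pairs that ask clients and proxies not to cache."""
    return [
        ("Content-Type", "text/plain; charset=UTF-8"),
        # A date in the past marks the response as already expired.
        ("Expires", "Mon, 26 Jul 1997 05:00:00 GMT"),
        # Current time in HTTP date format, like gmdate(...) . " GMT" in PHP.
        ("Last-Modified", formatdate(usegmt=True)),
        ("Cache-Control", "no-store, no-cache, must-revalidate"),
        # Second Cache-Control line, sent in addition to the first one.
        ("Cache-Control", "post-check=0, pre-check=0"),
        ("Pragma", "no-cache"),
    ]


for name, value in no_cache_headers():
    print("%s: %s" % (name, value))
```

A list of pairs is used instead of a dict so that Cache-Control can appear twice, which mirrors the `false` (do not replace) argument of the second PHP `header()` call.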
vpnc package ubuntu troubles
The vpnc package on Ubuntu Linux 10.10
problem:
~$ vpnc: expected xauth packet; rejected: (ISAKMP_N_PAYLOAD_MALFORMED)(16)
solution:
~$ vpnc --natt-mode force-natt