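tag:blogger.com,1999:blog-153138872024-02-28T10:25:21.325-08:00Free Mindthe Free Mind of pluskidpluskidhttp://www.blogger.com/profile/17997317415745134928noreply@blogger.comBlogger5125tag:blogger.com,1999:blog-15313887.post-11773669694211127932009-02-10T04:44:00.000-08:002009-02-10T04:46:06.364-08:00New BlogIt is here: <a href="http://blog.pluskid.org">http://blog.pluskid.org</a> , and the posts in the original blog have been archived here: <a href="http://lifegoo.pluskid.org">http://lifegoo.pluskid.org</a> .pluskidhttp://www.blogger.com/profile/17997317415745134928noreply@blogger.com0tag:blogger.com,1999:blog-15313887.post-79187450509988560432008-10-28T20:20:00.000-07:002008-10-28T20:21:05.421-07:00The Great pluskid.lifegoo.com Migration<p>This all happened rather suddenly. When I logged in to Google Talk last night, <a href="http://jack.lifegoo.com/">Jack</a> told me out of the blue that the hosted server is going to be taken back in November. I suppose that means every application on lifegoo.com will go offline. This blog was set up at my request last May, and it has worked very well. I really have to thank Jack and <a href="http://sishen.lifegoo.com/">sishen</a>!</p> <p>Still, the news is rather sad. I have used this blog for a long time, and it has collected quite a few incoming links. In any case, the first thing to do is back up the data. I plan to export a copy of the database the WordPress way first, and then crawl the site as ordinary web pages. It seems quark already crawled it for me last night and even packed it into a CHM file, so I will ask him for it later.</p> <p>After the backup I still have to decide where to go next. Offhand I cannot think of a place that is both easy to use and stable. Maybe it is time to register a domain name, though I do not know what that costs these days. But renting a virtual host just to hold one blog seems wasteful, and I know very little about the prices or about which providers are good. Time is short, and Blogger does not seem to be blocked at the moment, so for now I will use my blog there: <a href="http://pluskid.blogspot.com/">http://pluskid.blogspot.com</a> . If I find a new home, I will announce it there. I also do not know exactly when lifegoo will go offline, so I will post this article there as well in a moment.</p> <p>I do want to keep blogging, even though I post much less often than I used to. moonykily once told me he thinks people who write technical blogs have water on the brain, since a blog is meant for expressing your feelings. Of course I do not agree. Leaving aside the point that technical blogs share knowledge, there is a more selfish reason: if you can explain what you know clearly to someone else, you will have a firmer grasp of it yourself. And with some things, only when you sit down to write them out do you discover that you never really understood them. A post usually takes me anywhere from half a day to two days to write, but most of the time I feel it was worth it.</p> <p>So I am off to back up the data first. How to migrate afterwards is a real headache that I cannot work out right now. Does anyone have a good idea?</p>pluskidhttp://www.blogger.com/profile/17997317415745134928noreply@blogger.com1tag:blogger.com,1999:blog-15313887.post-44669837021294830492008-03-01T18:17:00.000-08:002008-03-01T18:19:34.320-08:00Fixnum Overflow in Ruby’s Hash 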
ImplementationThe <a href="http://pluskid.lifegoo.com/?p=286">original post</a> is on my <a href="http://pluskid.lifegoo.com/">main blog</a>.<br /><br />Ruby’s built-in <code>Hash</code> is the first choice when you need to look things up. Using your own custom object as a hash key is simple: define the following two methods on it: <ul><li><code>hash</code>: to get the hash code of the object.</li><li><code>eql?</code>: to test whether two objects are equal.</li></ul> <p>While working to improve the performance of <a href="http://rmmseg.rubyforge.org/">RMMSeg</a>, I tried to implement a <code>Substring</code> class that holds a reference to a big chunk of text instead of making an expensive copy. Then I implemented the <code>hash</code> and <code>eql?</code> methods. The calculated hash value is identical to that of the corresponding <code>String</code>, and <code>eql?</code> is properly implemented. But the whole thing did not seem to work properly.</p> <p><span id="more-286"></span>I thought it was my code’s fault, because this was the first time I had written a C extension for Ruby. I used gdb to trace the program, which is very hard to do because <code>Hash</code> is such a commonly used data structure in Ruby. Many core libraries use it. 
<img src="http://pluskid.lifegoo.com/wp-content/plugins/smilies-themer/adiumicons/sad.png" alt=":(" class="wp-smiley" /> </p> <p>However, finally I figured it out (after a sleep) and created a small piece of code to reproduce it:</p> <div class="wp_syntax"><div class="code"><pre class="ruby"><span style="color: rgb(153, 102, 204); font-weight: bold;">class</span> MyStr<br /> <span style="color: rgb(153, 102, 204); font-weight: bold;">def</span> initialize<span style="color: rgb(0, 102, 0); font-weight: bold;">(</span>str<span style="color: rgb(0, 102, 0); font-weight: bold;">)</span><br /> <span style="color: rgb(0, 102, 255); font-weight: bold;">@str</span> = str<br /> <span style="color: rgb(153, 102, 204); font-weight: bold;">end</span><br /> <span style="color: rgb(153, 102, 204); font-weight: bold;">def</span> hash<br /> <span style="color: rgb(0, 102, 255); font-weight: bold;">@str</span>.<span style="color: rgb(153, 0, 204);">hash</span><br /> <span style="color: rgb(153, 102, 204); font-weight: bold;">end</span><br /> <span style="color: rgb(153, 102, 204); font-weight: bold;">def</span> eql?<span style="color: rgb(0, 102, 0); font-weight: bold;">(</span>o<span style="color: rgb(0, 102, 0); font-weight: bold;">)</span><br /> <span style="color: rgb(0, 102, 255); font-weight: bold;">@str</span>.<span style="color: rgb(153, 0, 204);">eql</span>?<span style="color: rgb(0, 102, 0); font-weight: bold;">(</span>o<span style="color: rgb(0, 102, 0); font-weight: bold;">)</span><br /> <span style="color: rgb(153, 102, 204); font-weight: bold;">end</span><br /><span style="color: rgb(153, 102, 204); font-weight: bold;">end</span><br /><br />s1 = <span style="color: rgb(153, 102, 0);">"foo"</span><br />s2 = <span style="color: rgb(153, 102, 0);">"This"</span><br />my_s1 = MyStr.<span style="color: rgb(153, 0, 204);">new</span><span style="color: rgb(0, 102, 0); font-weight: bold;">(</span>s1<span style="color: rgb(0, 102, 0); font-weight: bold;">)</span><br />my_s2 = 
MyStr.<span style="color: rgb(153, 0, 204);">new</span><span style="color: rgb(0, 102, 0); font-weight: bold;">(</span>s2<span style="color: rgb(0, 102, 0); font-weight: bold;">)</span><br />h = <span style="color: rgb(0, 102, 0); font-weight: bold;">{</span> s1 => <span style="color: rgb(153, 102, 0);">"value of foo"</span>, s2 => <span style="color: rgb(153, 102, 0);">"value of This"</span> <span style="color: rgb(0, 102, 0); font-weight: bold;">}</span><br /><br /><span style="color: rgb(204, 0, 102); font-weight: bold;">puts</span> <span style="color: rgb(153, 102, 0);">"h[my_s1] = #{h[my_s1].inspect}"</span><br /><span style="color: rgb(204, 0, 102); font-weight: bold;">puts</span> <span style="color: rgb(153, 102, 0);">"h[my_s2] = #{h[my_s2].inspect}"</span></pre></div></div> <p>The expected output is</p> <div class="wp_syntax"><div class="code"><pre class="text">h[my_s1] = "value of foo"<br />h[my_s2] = "value of This"</pre></div></div> <p>but the actual output is</p> <div class="wp_syntax"><div class="code"><pre class="text">h[my_s1] = "value of foo"<br />h[my_s2] = nil</pre></div></div> <p>So what is wrong? Why does “foo” work but “This” fail? Looking at the code of <code>Hash</code> in Ruby answers the question. Ruby treats <code>String</code> specially: when the key is a <code>String</code>, it uses <code>rb_str_hash</code> directly to calculate the hash value.</p> <p><code>rb_str_hash</code> returns an <code>int</code>. But user-defined objects do not get this special treatment. Their <code>hash</code> method is called in the Ruby environment and returns a <code>Fixnum</code>, which is then converted to an <code>int</code>.</p> <p>The problem is that the value range of a <code>Fixnum</code> is smaller than that of an <code>int</code>. The calculated hash value of “This”, 2073740424, when converted to a <code>Fixnum</code> and then back, becomes -73743224, which is exactly 2073740424 - 2<sup>31</sup>, as if the top bit had been lost.</p> <p>That’s the problem. 
When the key is a <code>String</code>, its hash is the original 2073740424. But when it is not, the hash becomes the weird -73743224. It’s a bug, and it affects not only <code>String</code> but also <code>Symbol</code> and <code>Fixnum</code>. I’ve posted a bug report and a suggested patch to the ruby-core ML. I hope it gets fixed soon. <img src="http://pluskid.lifegoo.com/wp-content/plugins/smilies-themer/adiumicons/happy.png" alt=":)" class="wp-smiley" /> </p>pluskidhttp://www.blogger.com/profile/17997317415745134928noreply@blogger.com0tag:blogger.com,1999:blog-15313887.post-29897943614384006572008-02-26T02:39:00.000-08:002008-02-26T02:41:27.384-08:00[ANN] RMMSeg 0.1.2 ReleasedMainly performance improvements.<br /><br />rmmseg version 0.1.2<br />by pluskid<br /><a href="http://rmmseg.rubyforge.org">http://rmmseg.rubyforge.org</a><br /><br />== DESCRIPTION<br /><br />RMMSeg is an implementation of the MMSEG Chinese word segmentation<br />algorithm. It is based on two variants of maximum matching<br />algorithms. Two algorithms are available:<br /><br />* simple algorithm that uses only forward maximum matching.<br />* complex algorithm that uses three-word chunk maximum matching and 3<br />additional rules to resolve ambiguities.<br /><br />For more information about the algorithm, please refer to the<br />following essays:<br /><br />* <a href="http://technology.chtsai.org/mmseg/">http://technology.chtsai.org/mmseg/</a><br />* <a href="http://pluskid.lifegoo.com/?p=261">http://pluskid.lifegoo.com/?p=261</a><br /><br />== CHANGES<br /><br />* Add cache to find_match_words: performance improved.<br />* Implement Chunk as a module instead of a class: performance improved.<br />* Don’t store unnecessary data in dictionary: memory usage reduced.pluskidhttp://www.blogger.com/profile/17997317415745134928noreply@blogger.com0tag:blogger.com,1999:blog-15313887.post-33795235914743070892008-01-05T08:14:00.000-08:002008-01-05T08:20:29.700-08:00Running RSpec in Emacs<span style="font-size:78%;">This 
article was originally posted on my main blog (mostly Chinese): <a href="http://pluskid.lifegoo.com/?p=245">http://pluskid.lifegoo.com/?p=245</a><br /></span><br /><a href="http://rspec.info/">RSpec</a> is a Behaviour Driven Development framework for Ruby. Its output format can be customized. However, the default format works well in Emacs’s <code>compilation-mode</code>. Type <code>M-x compile</code> and enter <code>spec file_name_spec.rb</code>. The results appear in a new buffer.<br /><br />Some useful information is colorized. You can even click on a failure with the mouse to go directly to the line where the spec fails (of course there are shortcuts like <code>C-`</code> available). However, we can still make it better.<br /><h3>More highlighting</h3><p>By default the cursor sits at the beginning of the new buffer with the <code>spec</code> results. We want it at the end so that we can see how many examples failed. That’s easy; in fact, it was the default behavior before Emacs 20.3:</p> <div class="wp_syntax"><div class="code"><pre class="lisp"><span style="color: rgb(128, 128, 128); font-style: italic;">;; keep scrolling in compilation result buffer</span><br /><span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(177, 177, 0);">setq</span> compilation-scroll-output t<span style="color: rgb(102, 204, 102);">)</span></pre></div></div> <p>That’s simple and cool! But I want the numbers to be highlighted! And <b>more</b> highlighted when the number of failures is not zero. 
That’s also easy, we can add some rules to achieve this:</p> <p><span id="more-245"></span></p> <div class="wp_syntax"><div class="code"><pre class="lisp"><span style="color: rgb(102, 204, 102);">(</span>add-to-<span style="color: rgb(177, 177, 0);">list</span> 'compilation-mode-font-lock-keywords<br /> '<span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(255, 0, 0);">"^<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>([[:digit:]]+<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>) examples?, <span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>([[:digit:]]+<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>) failures?<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>(?:, <span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>([[:digit:]]+<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>) pendings?<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>)?$"</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(204, 102, 204);">0</span> '<span style="color: rgb(102, 204, 102);">(</span>face <span style="color: rgb(177, 177, 0);">nil</span> message <span style="color: rgb(177, 177, 0);">nil</span> help-echo <span style="color: rgb(177, 177, 0);">nil</span> mouse-face <span style="color: rgb(177, 177, 0);">nil</span><span style="color: rgb(102, 204, 102);">)</span> t<span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(204, 102, 204);">1</span> compilation-info-face<span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(204, 102, 204);">2</span> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(177, 177, 0);">if</span> <span style="color: rgb(102, 204, 102);">(</span>string= <span style="color: rgb(255, 0, 0);">"0"</span> <span style="color: 
rgb(102, 204, 102);">(</span>match-string <span style="color: rgb(204, 102, 204);">2</span><span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><br /> compilation-info-face<br /> compilation-error-face<span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(204, 102, 204);">3</span> compilation-info-face t t<span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span></pre></div></div> <p>And here’s a screenshot:</p> <center><img src="http://pluskid.lifegoo.com/wp-content/uploads/2008/01/emacs-rspec.png" alt="emacs-rspec.png" /></center> <p>Yeah! Cool! <img src="http://pluskid.lifegoo.com/wp-content/plugins/smilies-themer/adiumicons/biggrin.png" alt=":D" class="wp-smiley" /> </p> <h3>Smart Compile</h3> <p><a href="http://www.emacswiki.org/cgi-bin/wiki?SmartCompile">smart-compile.el</a> is an Emacs extension that guesses the compilation command for different types of files. Customization is simple. Here’s my customization (I use Emacs to edit a lot of files):</p> <div class="wp_syntax"><div class="code"><pre class="lisp"><span style="color: rgb(102, 204, 102);">(</span>require 'smart-compile<span style="color: rgb(102, 204, 102);">)</span><br /><span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(177, 177, 0);">setq</span> smart-compile-alist<br /> '<span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(255, 0, 0);">"/programming/guile/.*c$"</span> . 
<span style="color: rgb(255, 0, 0);">"gcc -Wall %f `guile-config link` -o %n"</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(255, 0, 0);">"<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>.c<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>'"</span> . <span style="color: rgb(255, 0, 0);">"gcc -Wall %f -lm -o %n"</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(255, 0, 0);">"<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>.[Cc]+[Pp]*<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>'"</span> . <span style="color: rgb(255, 0, 0);">"g++ -Wall %f -lm -o %n"</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(255, 0, 0);">"<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>.java$"</span> . <span style="color: rgb(255, 0, 0);">"javac %f"</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(255, 0, 0);">"_spec<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>.rb$"</span> . <span style="color: rgb(255, 0, 0);">"spec %f"</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(255, 0, 0);">"<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>.rb$"</span> . <span style="color: rgb(255, 0, 0);">"ruby %f"</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span>emacs-lisp-mode . <span style="color: rgb(102, 204, 102);">(</span>emacs-lisp-byte-compile<span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span>html-mode . 
<span style="color: rgb(102, 204, 102);">(</span>browse-url-of-buffer<span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span>html-helper-mode . <span style="color: rgb(102, 204, 102);">(</span>browse-url-of-buffer<span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span><span style="color: rgb(255, 0, 0);">"<span style="color: rgb(0, 0, 153); font-weight: bold;">\\</span>.skb$"</span> . <span style="color: rgb(255, 0, 0);">"skribe %f -o %n.html"</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span>haskell-mode . <span style="color: rgb(255, 0, 0);">"ghc -o %n %f"</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span>asy-mode . <span style="color: rgb(102, 204, 102);">(</span>call-interactively 'asy-compile-view<span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><br /> <span style="color: rgb(102, 204, 102);">(</span>muse-mode . <span style="color: rgb(102, 204, 102);">(</span>call-interactively 'muse-project-publish<span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><span style="color: rgb(102, 204, 102);">)</span><br /><span style="color: rgb(102, 204, 102);">(</span>global-set-key <span style="color: rgb(102, 204, 102);">(</span>kbd <span style="color: rgb(255, 0, 0);">"<f9>"</span><span style="color: rgb(102, 204, 102);">)</span> 'smart-compile<span style="color: rgb(102, 204, 102);">)</span></pre></div></div> <p>I set the global shortcut to <code>f9</code>. Now just name your spec files with <code>foo_spec.rb</code> (this is the convention). When you are in the buffer, just press <code>f9</code>. 
It will prompt you with the correct command to run your specs. </p> <p>I hope you enjoy it! <img src="http://pluskid.lifegoo.com/wp-content/plugins/smilies-themer/adiumicons/happy.png" alt=":)" class="wp-smiley" /> </p>pluskidhttp://www.blogger.com/profile/17997317415745134928noreply@blogger.com1
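<p>To check what the highlighting rule in the RSpec post above matches without loading it into Emacs, the same pattern can be sketched in Ruby. This is only an illustration: the constant <code>SUMMARY_LINE</code>, the helper <code>classify_summary</code>, the <code>:info</code>/<code>:error</code> symbols and the sample summary lines are all made up for this sketch, and the symbols merely stand in for <code>compilation-info-face</code> and <code>compilation-error-face</code>.</p>

```ruby
# Ruby rendering of the regular expression used in the font-lock rule.
# Emacs writes groups as \\( \\) inside a Lisp string; Ruby uses plain ( ).
SUMMARY_LINE = /^([[:digit:]]+) examples?, ([[:digit:]]+) failures?(?:, ([[:digit:]]+) pendings?)?$/

# Hypothetical helper (not part of RSpec or Emacs): parse one line of output
# and report which face the rule would pick for the failure count.
def classify_summary(line)
  m = SUMMARY_LINE.match(line)
  return nil unless m                      # not a summary line

  {
    examples: m[1].to_i,
    failures: m[2].to_i,
    pending:  m[3] && m[3].to_i,           # nil when the optional part is absent
    # Same test as (string= "0" (match-string 2)) in the Lisp rule.
    face:     m[2] == "0" ? :info : :error
  }
end

# Assumed sample lines, written by hand for this sketch.
p classify_summary("12 examples, 0 failures")
p classify_summary("12 examples, 3 failures, 1 pending")
p classify_summary("Finished in 0.01 seconds")
```

<p>A line that does not look like a summary returns <code>nil</code>, which mirrors the fact that the font-lock rule leaves such lines untouched.</p>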